Engineered heterodimeric protein domains

ABSTRACT

The present invention provides an engineered multidomain protein including at least two nonidentical engineered domains, each of which contains a protein-protein interaction interface containing amino acid sequence segments derived from two or more existing homologous parent domains, thereby conferring on the engineered domains assembly specificities distinct from assembly specificities of the parent domains. In particular, the engineered domains form heterodimers with one another preferentially over forming homodimers. Methods of designing and using the engineered proteins are also included.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a divisional of U.S. application Ser. No. 14/505,653, filed on Oct. 3, 2014, which is a divisional of U.S. application Ser. No. 11/728,048, filed Mar. 23, 2007, which claims priority to and the benefit of U.S. Provisional Patent Application No. 60/785,474, filed on Mar. 24, 2006, the entire contents of each of which are incorporated by reference herein.

SEQUENCE LISTING

The instant application contains a Sequence Listing which has been submitted as a paper copy and in electronic form by reference to the computer readable form of the sequence listing submitted in U.S. patent application Ser. No. 11/728,048 on Mar. 23, 2007, and is hereby incorporated by reference in its entirety.

FIELD OF THE INVENTION

The invention relates to engineered heterodimeric protein domains and methods of making the same.

BACKGROUND OF THE INVENTION

Nature provides a large number of homodimeric proteins and protein domains that fall into families of related proteins. Such proteins and domains often form homodimers with themselves but do not form heterodimers with other family members. On the other hand, heterodimeric or heteromultimeric proteins are often useful. They provide novel therapeutics and research tools. For example, bispecific antibodies (BsAbs) capable of binding to at least two different antigens have significant potential in a wide range of clinical applications as targeting agents for in vitro and in vivo immunodiagnosis and therapy, and for diagnostic immunoassays. In the diagnostic area, BsAbs have been very useful in probing the functional properties of cell surface molecules and in defining the ability of the different Fc receptors to mediate cytotoxicity (Fanger et al. (1992) Crit. Rev. Immunol. 12:101-124, the teachings of which are hereby incorporated by reference). However, when BsAbs are generated simply by co-expression of multiple components that can interact without specificity, a large number of species are often generated, and it is often difficult to separate the desired species from the undesired species. Therefore, it is desirable to have techniques for efficiently making heteromultimers. It is particularly desirable to generate antibody subunits that form heterodimers preferentially over forming homodimers so that BsAbs can be directly recovered from recombinant cell culture.

Methods for making heterodimeric proteins have been reported. For example, Stahl and Yancopoulos described the use of fusion proteins including two different receptor subunits to form soluble heterodimeric receptors that could bind to a given cytokine in circulation, and thus block the activity of that cytokine (see U.S. Pat. No. 6,472,179). Carter et al. described a “protuberance-into-cavity” approach for generating a heterodimeric Fc moiety (see U.S. Pat. No. 5,807,706).

These existing methods allow construction of individual heterodimers, but do not provide general techniques for construction of multimeric proteins involving multiple domain interactions. Therefore, there is a need for a general system for designing heterodimeric pairs that can specifically assemble in an environment containing multiple different potential assembly partners.

SUMMARY OF THE INVENTION

The present invention provides a novel approach for designing protein domains that preferentially heterodimerize or heteromultimerize. In particular, the invention uses a “Strand Exchange Engineered Domain” (SEED) strategy to engineer a protein-protein interaction interface that promotes heterodimerization or heteromultimerization. The invention also provides proteins containing domains engineered using the method of the present invention.

In one aspect, the present invention features a multidomain protein including at least first and second nonidentical engineered domains, each of which contains a protein-protein interaction interface containing amino acid sequence segments derived from two or more naturally-occurring homologous parent domains, thereby conferring on the first and second engineered domains assembly specificities distinct from assembly specificities of the parent domains, wherein the first and second engineered domains form heterodimers with one another preferentially over forming homodimers (e.g., the heterodimers constitute more than 55%, 65%, 75%, 80%, 85%, 90%, or 95% of the total amount of dimers). The first and second engineered domains are not antibody variable domains. In some embodiments, the multidomain protein of the invention includes a first subunit containing the first engineered domain and a second subunit containing the second engineered domain. As used herein, an “amino acid sequence segment” includes any sequence segment containing two or more amino acids (e.g., three or more, four or more, five or more, six or more, seven or more, eight or more, nine or more, or ten or more).

In preferred embodiments, the multidomain protein includes nonidentical domains engineered from naturally-occurring homologous parent domains that are immunoglobulin superfamily domains, such as, for example, antibody CH3 domains. In particular, the engineered domains are derived from IgG and IgA CH3 domains.

In some embodiments, the multidomain protein of the invention includes engineered domains that are part of polypeptide chains that are connected by a disulfide bond.

In one embodiment, one of the engineered domains contained in the multidomain protein of the invention includes at least two non-adjacent sequence segments derived from the same parent domain. In another embodiment, each of the first and second engineered domains includes at least two, three, or four or more non-adjacent sequence segments derived from the same parent domain. In another embodiment, at least one of the engineered domains includes sequence segments from each parent domain that are at least two amino acids in length. In another embodiment, at least one of the engineered domains includes sequence segments from each parent domain that are at least three, four, five or six amino acids in length.

In some embodiments, the multidomain protein of the invention includes a first bio-active domain. The first bio-active domain may occupy a position N-terminal or C-terminal to the first engineered domain.

In further embodiments, the multidomain protein may include a second bio-active domain in addition to the first bio-active domain. In one embodiment, the second bio-active domain is associated with the second engineered domain and may occupy a position N-terminal or C-terminal to the second engineered domain. In an alternate embodiment, the second bio-active domain is also associated with the first engineered domain and may occupy a position opposite the first bio-active domain. For example, the first and second bio-active domains may occupy positions N-terminal and C-terminal, respectively, to the first engineered domain.

The multidomain protein of the present invention can be used to generate bispecific antibodies. For example, the multidomain protein may include a first bio-active domain containing an antibody variable domain and a second bio-active domain containing a second antibody variable domain with distinct specificity.

In another aspect, the invention provides a multidomain protein, wherein the first bio-active region contains two or more antibody variable domains of a first specificity or of a first combination of specificities. The multidomain protein may also contain a second bio-active region including two or more antibody variable domains of a second specificity or second combination of specificities. For example, the multidomain protein may include one or more single-chain Fv moieties, a diabody (one VH-VL chain), a single-chain diabody [a VH(1)-VL(2)-VH(2)-VL(1)], or other single-chain Fv fused repeats (of the same or different specificities).

In another aspect, the invention provides a multidomain protein, wherein the first bio-active region comprises two or more antibody variable domains of a first specificity or of a first combination of specificities. The multidomain protein further comprises a second bio-active region comprising two or more antibody variable domains of a second specificity or second combination of specificities that are substantially distinct from the first combination of specificities.

The present invention further contemplates a method of colocalizing bio-active domains when administered to a biological system. The method includes the step of administering to the biological system the multimeric protein including first and second bio-active domains as described above in various embodiments. In one embodiment, the biological system is a mammal. In a more preferred embodiment, the biological system is a human.

In another aspect, the present invention provides a multidomain protein including at least first and second nonidentical engineered domains that meet at an interface. The interface of the first engineered domain contains at least two amino acid sequence segments, each segment being derived from a different naturally-occurring homologous parent domain, thereby conferring an assembly specificity distinct from the assembly specificity of the parent domains, wherein the first and second engineered domains form heterodimers with one another preferentially over forming homodimers. In a preferred embodiment, the second engineered domain also contains at least two amino acid sequence segments, each segment being derived from a different naturally-occurring homologous parent domain, thereby conferring an assembly specificity distinct from the assembly specificity of the parent domains, wherein the first and second engineered domains form heterodimers with one another preferentially over forming homodimers.

In yet another aspect, the present invention provides a multidomain protein including at least first and second nonidentical engineered domains that meet at an interface, wherein (1) the first and second engineered domains are derived from two or more naturally-occurring homologous parent domains, (2) the interface from the first engineered domain comprises at least one amino acid sequence segment interacting with an amino acid sequence segment on the interface of the second engineered domain derived from the same parent domain, and (3) the first and second engineered domains form heterodimers with one another preferentially over forming homodimers.

In another aspect, the present invention provides a multimeric protein including a domain with an amino acid sequence derived from two or more homologous parent domains and an interaction surface on said domain that mediates multimerization and that comprises amino acids derived from more than one of the parent domains; and wherein the specificity of multimerization is enhanced by the presence of amino acids from different parent domains. In some embodiments, the domain is part of a polypeptide chain with a disulfide bond that enhances assembly.

In a further aspect, the present invention features an engineered immunoglobulin domain containing a protein-protein interaction interface including amino acids from two or more parent immunoglobulin domains such that the protein-protein interaction interface confers on the engineered immunoglobulin domain assembly specificities that are distinct from assembly specificities of the parent immunoglobulin domains, wherein the engineered immunoglobulin domain is not an antibody variable domain. In preferred embodiments, the engineered immunoglobulin domain of the invention assembles with a partner domain with enhanced specificity compared to the parent domains. In some embodiments, the partner domain is an engineered immunoglobulin domain of the invention.

In yet another aspect, the present invention provides an engineered immunoglobulin superfamily domain containing a protein-protein interaction interface including amino acids from two or more parent immunoglobulin superfamily domains such that the protein-protein interaction interface confers on the engineered immunoglobulin superfamily domain interaction properties that are distinct from interaction properties of the parent immunoglobulin superfamily domains.

The invention also provides a multidomain protein comprising an engineered domain with the following properties. Firstly, the engineered domain comprises a protein-protein interaction interface. Secondly, the engineered domain is homologous to a family of naturally occurring domains, preferably such that the amino acid sequence of the engineered domain can be aligned with amino acid sequences of naturally occurring domains, which can further be aligned with each other. Preferably, the alignment of the amino acid sequences of the naturally occurring domains corresponds to an alignment of the three-dimensional structures of the naturally occurring domains. Thirdly, the interaction interface of the engineered domain comprises amino acids from corresponding sequence positions from two or more naturally-occurring parental domains. Fourthly, the amino acids in the interface of the engineered domain, considered as a group, are not all found in the corresponding interface of any single member of the homologous naturally occurring domains. Fifthly, the interaction interface of the engineered domain confers assembly properties distinct from any of the parental domains. Preferably, the assembly properties of the engineered domain are distinctive because the interaction interface has amino acids from two or more different parents that make specific contacts with assembly partners, thus acquiring an assembly specificity that is a hybrid between the assembly specificities of the parent domains.
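By way of illustration only, the third and fourth properties above can be expressed as a simple check on aligned sequences. The following is a minimal sketch, assuming that the engineered domain and its parental domains are given as gap-aligned strings of equal length and that the interface is supplied as a list of alignment positions. The function name `is_hybrid_interface` and the toy sequences are hypothetical and do not appear in this disclosure.

```python
def is_hybrid_interface(engineered, parents, interface_positions):
    """Return True if the interface of `engineered` is a hybrid of the parents.

    engineered: aligned sequence of the engineered domain.
    parents: list of aligned parental sequences, same length as `engineered`.
    interface_positions: alignment indices treated as interface positions.
    """
    if any(len(p) != len(engineered) for p in parents):
        raise ValueError("all sequences must be aligned to the same length")

    # Third property: every interface residue is found at the corresponding
    # position of at least one parent.
    for i in interface_positions:
        if not any(engineered[i] == p[i] for p in parents):
            return False

    # Fourth property: no single parent supplies the whole interface.
    for p in parents:
        if all(engineered[i] == p[i] for i in interface_positions):
            return False
    return True


# Toy example with two hypothetical parents.
parents = ["AAAA", "BBBB"]
print(is_hybrid_interface("ABAB", parents, [0, 1]))  # interface mixes both parents
print(is_hybrid_interface("AABB", parents, [0, 1]))  # interface is all from parent 1
```

Such a check addresses sequence composition only; whether the resulting interface actually confers distinct assembly properties (the fifth property) would be determined experimentally.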

Furthermore, the present invention provides nucleic acid encoding a multidomain protein as described in various embodiments above. In particular, the present invention provides nucleic acid encoding a multidomain protein including at least one bio-active domain. The present invention also provides cells containing the nucleic acid of the invention.

In another aspect, the present invention provides a method of designing a multidomain protein with domains that preferentially heterodimerize. The method includes the following steps: (a) selecting a first polypeptide, a second polypeptide, a third polypeptide and a fourth polypeptide, wherein the first and third polypeptides dimerize with each other, but not with the second or fourth polypeptide, and wherein said second and fourth polypeptides dimerize with each other, (b) composing an amino acid sequence of a first domain from the first and the second polypeptides comprising at least one assembly element from the first polypeptide, and (c) composing an amino acid sequence of a second domain from the third and fourth polypeptides comprising at least one assembly element from the third polypeptide, such that the assembly elements from the first and third polypeptides assemble with each other, promoting heterodimerization of the first and second domains.

In some embodiments, the method of the invention composes an amino acid sequence of the first domain further including an assembly element from the second polypeptide and an amino acid sequence of the second domain further including an assembly element from the fourth polypeptide such that the assembly elements from the second and fourth polypeptides assemble with each other, promoting heterodimerization of the first and second domains.

In some embodiments, step (b) or step (c) of the above-described method includes comparing three-dimensional structures of two or more of the first, second, third or fourth polypeptides. In some embodiments, identical first and third polypeptides are selected. In other embodiments, identical first and third polypeptides are selected and identical second and fourth polypeptides are selected.

In some embodiments, step (b) or step (c) of the above-described method includes comparing aligned amino acid sequences of two or more of the first, second, third or fourth polypeptides. In some embodiments, identical first and third polypeptides are selected. In other embodiments, identical first and third polypeptides are selected and identical second and fourth polypeptides are selected.

Other features, objects, and advantages of the present invention are apparent in the detailed description that follows. It should be understood, however, that the detailed description, while indicating preferred embodiments of the present invention, is given by way of illustration only, not limitation. Various changes and modifications within the scope of the invention will become apparent to those skilled in the art from the detailed description.

BRIEF DESCRIPTION OF THE DRAWINGS

The Figures are provided for illustration, not limitation.

FIG. 1A schematically depicts an exemplary method of designing SEED constructs. Two related parent domains X and Y are aligned. The sequences of the two SEED subunits (XY and YX) are then generated by choosing for one SEED subunit alternating sequence segments from the two parental sequences, and choosing the complementary sequence segments to generate the other SEED subunit sequence. SEEDs engineered by this method are referred to as “Full” SEEDs.
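The alternating-segment scheme of FIG. 1A can be sketched in code as follows. This is an illustrative sketch only, assuming that the two parent sequences have already been aligned to equal length and that the exchange points are supplied as alignment indices. The function name `compose_full_seeds`, the toy parent sequences, and the exchange points are hypothetical; choosing suitable exchange points for real domains relies on structural considerations that this sketch does not model.

```python
def compose_full_seeds(parent_x, parent_y, exchange_points):
    """Compose two complementary daughter sequences from two aligned parents.

    parent_x, parent_y: aligned parent sequences of equal length.
    exchange_points: strictly increasing alignment indices at which the
        source parent switches.
    Returns (xy, yx): xy starts with a segment from parent X, yx with the
    complementary segment from parent Y.
    """
    if len(parent_x) != len(parent_y):
        raise ValueError("parents must be aligned to the same length")
    bounds = [0, *exchange_points, len(parent_x)]
    if any(a >= b for a, b in zip(bounds, bounds[1:])):
        raise ValueError("exchange points must be strictly increasing and inside the alignment")

    xy_segments, yx_segments = [], []
    for n, (start, end) in enumerate(zip(bounds, bounds[1:])):
        # Even-numbered segments: XY takes X and YX takes Y; odd-numbered
        # segments are swapped, so the two daughters are complementary.
        first, second = (parent_x, parent_y) if n % 2 == 0 else (parent_y, parent_x)
        xy_segments.append(first[start:end])
        yx_segments.append(second[start:end])
    return "".join(xy_segments), "".join(yx_segments)


# Toy parents: every position of X is "x" and every position of Y is "y",
# so the origin of each daughter residue is visible at a glance.
xy, yx = compose_full_seeds("xxxxxxxx", "yyyyyyyy", [2, 5])
print(xy)  # xxyyyxxx
print(yx)  # yyxxxyyy
```

At every alignment position the two daughters draw on different parents, which is the complementarity that the XY and YX subunits of FIG. 1A are intended to have.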

FIG. 1B schematically depicts a second exemplary method of designing SEED constructs, which is similar to FIG. 1A except that only amino acids forming the dimerization interface are chosen from one of the parental sequences. SEEDs engineered by this method are also referred to as “Surface” SEEDs.

FIG. 1C depicts diagrammatically exemplary configurations of a SEED heterodimer, composed of a first daughter SEED (white oval) and a second daughter SEED (black oval), and a fusion partner, such as a bioactive domain (stalked white diamond). The SEED moiety and the fusion partner may be coupled by a linker segment (not depicted). In configurations with more than one fusion partner, the fusion partners may be identical to one another or distinct from one another, although in the diagrams they are shown generically as a stalked white diamond. The fusion partner may be N-terminal (A) or C-terminal (B) to the SEED moiety. There may be multiple concatenated fusion partners on one end of a SEED, as in (C), or the fusion partners may be located at opposite ends of a SEED (D). One fusion partner may be placed N-terminal to a first daughter SEED and a second fusion partner may be placed N-terminal (F) or C-terminal (G) to a second daughter SEED. The SEED heterodimer may contain three (H) or four (I) fusion partners.

FIG. 2 depicts the structural alignment of human IgG1 CH3 (SEQ ID NO:51) and human IgA CH3 (SEQ ID NO:52) domains. Residue numbers are shown above and below the sequences. IgG1 is numbered according to Kabat EU numbering, while IgA is sequentially numbered as in the PDB structure 1OW0. Bold letters designate the backbone positions that were included in the alignment described in Table 2 in Example 1. Diamonds designate residues that contact or come close to the dimerization interface in IgG1 and IgA homodimers.

FIG. 3A depicts the sequence alignments and secondary structure of human IgA (SEQ ID NO:52), IgG1 (SEQ ID NO:51) and daughter “Surface” SEED sequences “AG SURF” (SEQ ID NO: 10) and “GA SURF” (SEQ ID NO: 11), while FIG. 3B depicts the sequence alignments and secondary structure of human IgA, IgG1 and daughter “Full” SEED sequences “AG SEED” (SEQ ID NO:3) and “GA SEED” (SEQ ID NO:6). IgG1 is numbered according to Kabat EU numbering, while IgA is sequentially numbered as in the PDB structure 1OW0 (native numbers in center of alignment). For the purposes of this figure, the sequential numbering of the SEED sequence is interrupted at extra loop residues, which are designated with letters “A”, “B”, etc. (e.g., 18A), to illustrate the structural alignment of the molecules. Strand exchange points are designated by bold sequence letters. The two exchange points that contain no common residues are italicized. Modeled secondary structures (arrows above and below sequences) of the two SEEDs illustrate the strand exchanges, and are colored to indicate the manner in which the domain was divided, as shown in FIGS. 6B and 6C. White segments □ are from IgA; gray segments are from IgG, and black segments ▪ are common residues at exchange points. Twelve (12) residues in IgA segments are underlined. These are residues that were kept as IgG because of their proximity to the CH3/CH2 interface region. These residues are not involved in CH3 dimerization, but they are potentially important for the interaction with CH2, and/or with the complex with FcRn. Since the CH2 is human IgG for both SEEDs, these residues were kept to maintain both the native CH2/CH3 interaction and the various well-known advantages conferred by FcRn binding.

FIG. 4 is a representation of an IgG antibody molecule illustrating the symmetry of the CH3 homodimer. The vertical bar designates the axis of 2-fold rotational symmetry.

FIG. 5 is a representation of a bispecific, antibody-like molecule having two different Fab domains, paired by the heterodimeric SEED analogue of the CH3 domain. The hashed, gray portion represents the IgG-derived portion, while the white □ represents the IgA-derived portion. The symmetry of the CH3 complex is broken in the AG/GA heterodimer, as represented by the “X” on the vertical bar designating the axis of two-fold rotational symmetry.

FIGS. 6A-C are schematic representations of the secondary structure of IgG CH3 and the two CH3-based SEEDs. FIG. 6A depicts the secondary structure of wild type CH3.

FIG. 6B depicts the secondary structure of the “GA SEED,” and shows the strand exchange pattern. Gray represents IgG sequence; white □ represents IgA sequence; and black ▪ shows the exchange points, with a broader black band indicating residues that are conserved in both IgA and IgG.

FIG. 6C depicts the secondary structure of the “AG SEED,” which contains a pattern opposite to the pattern of the “GA SEED”.

FIGS. 7A-C are ribbon diagram representations of the three-dimensional structure of the “GA SEED” and “AG SEED” CH3 domains and of their putative heterodimeric structure depicting exchange crossover point and CH3 domain interactions. In all diagrams, white or light gray ribbons represent IgA sequence and structure, dark gray corresponds to IgG sequence and structure, and black sections denote where the sequence exchanges from G to A or vice versa. Aside from the two exchange points at 55-56 and 101-102 (numbered according to FIG. 3B), all black residues are shared by IgA and IgG, in sequence and in basic structure.

FIG. 7A depicts the “GA SEED,” where the N-terminus begins as IgG sequence and ends as IgA after exchanging seven times. In this structure, the upper layer of β-strands is in the outside sheet, while the layer behind forms the interface with the other CH3 domain.

FIG. 7B depicts the “AG SEED,” beginning with IgA sequence. Here, the front β-strands form the interface, while the β-strands behind are on the outside of the dimer.

FIG. 7C depicts the putative heterodimeric structure of the “GA SEED” and “AG SEED.” Translating the structure shown in FIG. 7A over the structure shown in FIG. 7B brings the interface surfaces together. The black residues form an approximate plane that is oriented vertically and perpendicular to the page. All residues to the left are dark gray (IgG), while all residues to the right are white (IgA). Thus, with white opposite white and gray opposite gray, the whole of the interface is well formed, as a fusion of the IgA and IgG interfaces. The alternative homodimers (AG/AG and GA/GA) would each have their IgA side juxtaposed to their IgG side (on both sides of the dividing plane), and so are disfavored.

FIGS. 8A-F, 9A-F, and 10 diagrammatically show a series of protein molecules that can be made using the SEED moieties described herein. For all of these figures, different moieties are indicated as follows. In FIGS. 8A-F and FIGS. 9A-F, polypeptide chains that include the GA SEED are colored black, while the polypeptide chains that include the AG SEED are colored white. Within such polypeptide chains, antibody V regions that are part of the GA SEED-containing polypeptide chain are black with thin white stripes, while antibody V regions that are part of the AG SEED-containing polypeptide chain are white with thin black stripes. Light chain constant regions are shown with a checkerboard pattern. Antibody hinge regions are shown as thin ovals connected by an “S—S” and a thick line to represent the disulfide bonds between the hinge regions. Polypeptide linkers are represented with dashed lines.

Portions of FIGS. 8A-F, FIGS. 9A-F, and FIG. 10 are numerically labeled as follows. In some cases, to simplify the figures, numerical labels are not shown, but the identity of the various domains and regions can be inferred from figures with corresponding domains and regions.

-   “1” indicates a GA-associated set of heavy and light chain V regions.
-   “2” indicates an AG-associated set of heavy and light chain V regions.
-   “3” indicates a GA-associated light chain V region.
-   “4” indicates an Fab region.
-   “5” indicates a GA-associated heavy chain V region.
-   “6” indicates an AG-associated heavy chain V region.
-   “7” indicates an AG-associated light chain V region.
-   “8” indicates a light chain constant region.
-   “9” indicates an Fc region comprising a SEED pair.
-   “10” indicates a SEED pair.
-   “11” indicates an artificial linker.
-   “12” indicates a GA-associated single-domain or camelid V region.
-   “13” indicates an AG-associated single-domain or camelid V region.
-   “14” indicates a diabody or single chain fused diabody that is incorporated into the polypeptide chain comprising the GA SEED.
-   “15” indicates a diabody or single chain fused diabody that is incorporated into the polypeptide chain comprising the AG SEED.
-   “16”, “17”, “18”, or “19” refers to any protein or peptide, such as a non-Ig domain. Such domains may include, for example, cytokines, hormones, toxins, enzymes, antigens, and extracellular domains of cell surface receptors.
-   “20” indicates a canonical homodimeric Fc region.
-   “21” indicates a canonical homodimeric pair of CH3 domains.

FIGS. 8A-F illustrate types of antibody-type SEED configurations comprising moieties with essentially naturally occurring V regions, such as the Fab (FIG. 8A and FIG. 8B), single-chain Fab (FIG. 8C and FIG. 8D), and single-domain or camelid single-domain V regions (FIG. 8E and FIG. 8F). FIG. 8A, FIG. 8C, and FIG. 8E show molecules comprising an essentially intact Fc region, including CH2 domains, as well as a hinge. FIG. 8B, FIG. 8D, and FIG. 8F show molecules lacking a CH2 domain, in which the hinge is optionally replaced by a linker that optionally possesses or lacks cysteine residues capable of disulfide bonding.

FIGS. 9A-F illustrate types of antibody-type SEED configurations comprising moieties with artificially configured V regions, such as single-chain Fvs (FIG. 9A and FIG. 9B), diabodies (FIG. 9C and FIG. 9D), and single-chain Fvs with additional moieties attached to the N- and/or C-termini of the two polypeptide chains (FIG. 9E and FIG. 9F). FIG. 9A, FIG. 9C, and FIG. 9E show molecules comprising an essentially intact Fc region, including CH2 domains, as well as a hinge. FIG. 9B, FIG. 9D, and FIG. 9F show molecules lacking a CH2 domain, in which the hinge is optionally replaced by a linker that optionally possesses or lacks cysteine residues capable of disulfide bonding.

FIG. 10 diagrammatically illustrates a molecule in which a GA/AG SEED pair essentially replaces the CH1-CL pairing in an antibody. Additional moieties, indicated by X and Y, may be placed at the N-termini of the GA and AG SEEDs. Moiety X and moiety Y can be, for example, a Fab, a single-chain Fab, a camelid single-domain V region, a single-chain Fv, or a single-chain diabody such as that illustrated by “14” and “15” in FIG. 9C and FIG. 9D. Additional moieties may be fused to the C-termini of the CH3 domains indicated by “21”.

FIG. 11A shows an Fc heterodimer produced as described in Example 5, in which an AG SEED moiety has an IL-2 moiety fused to its C-terminus. The CH2 and hinge moieties are identical in this case. FIG. 11B shows an antibody produced as described in Example 7, in which an AG SEED moiety has an IL-2 moiety fused to its C-terminus. Each antibody domain is represented by an oval, and the IL-2 moiety is represented by a white square. The CH2, CH1, hinge, VH, VL, and CL moieties are identical in this case. The hinge regions are attached by disulfide bonds represented by “S—S” in the figure. The light chain constant region is represented with a checkerboard pattern. The light chain V region is represented with a vertical-striped pattern. The VH, CH1, and CH2 regions are black.

FIGS. 12A-C depict the preferential assembly of AG/GA SEEDs into heterodimers as represented by the results of expression of Fc and Fc-IL2 in the same cell. FIG. 12A depicts the possible configurations of molecules resulting from coexpression of Fc and Fc-IL2, such that each dimeric species has a different molecular weight. FIG. 12B depicts a non-reducing SDS gel in which the following samples were loaded: lane 1: molecular weight standards; lanes 2-4: about 1, 2, and 4 micrograms of total protein of the “full” Fc(GA SEED)/Fc(AG SEED)-IL2 expressed from NS/0 cells; lanes 5-7: about 1, 2, and 4 micrograms of total protein of the “surface” Fc(GA SEED)/Fc(AG SEED)-IL2 expressed from NS/0 cells; lanes 8-10: about 1, 2, and 4 micrograms of total protein of the parental IgG Fc/Fc-IL2 expressed from NS/0 cells. FIG. 12C is a reducing gel showing the ratio of expression of IgG-derived Fc and Fc-IL2.

FIGS. 12D-E depict Western blot analysis of non-reduced samples (FIG. 12D) and reduced samples (FIG. 12E) of the Fc/Fc-IL2 proteins of FIGS. 12B-C. Duplicate samples of “full” Fc(GA SEED)/Fc(AG SEED)-IL2 (lanes 1 and 4), “surface” Fc(GA SEED)/Fc(AG SEED)-IL2 (lanes 2 and 5), and parental Fc/Fc-IL2 (lanes 3 and 6) were loaded and the blot was probed using anti-human IgG Fc (lanes 1-3) and anti-human IL-2 (lanes 4-6) antibodies.

DETAILED DESCRIPTION OF THE INVENTION

The present invention provides methods for designing protein domains that preferentially heterodimerize or heteromultimerize. In particular, the invention uses a “Strand Exchange Engineered Domain” (SEED) strategy to engineer a protein-protein interaction interface that promotes heterodimerization or heteromultimerization. The invention also provides multidomain proteins containing domains engineered using this approach. Thus, the present invention represents a significant advance in protein engineering.

Various aspects of the invention are described in further detail in the following subsections. The use of subsections is not meant to limit the invention. Each subsection may apply to any aspect of the invention.

As used herein, a “multidomain protein” includes any protein containing two or more domains. The domains may be on a single polypeptide; they may also be on different polypeptides. “Heteromultimerization” refers to nonidentical domains forming a multimeric complex mediated by domain interactions. A “heteromultimeric protein” is a protein molecule comprising at least a first subunit and a second subunit, each subunit containing a nonidentical domain. The heteromultimer can include a “heterodimer” formed by the first and second subunit or can form higher order structures (e.g., ternary) where subunit polypeptides in addition to the first and second subunit are present. Typically, each subunit contains a domain. Exemplary structures for the heteromultimer include heterodimers, heterotrimers, heterotetramers (e.g., a bispecific antibody) and further oligomeric structures.

As used herein, a “domain” includes any region of a polypeptide that is responsible for selectively assembling with an assembly partner of interest (e.g., another domain, ligand, receptor, substrate or inhibitor). Exemplary domains include an immunoglobulin superfamily constant domain such as a CH2 or CH3 domain, a receptor binding domain, a ligand binding domain, an enzymatic domain, or any polypeptide that has been engineered and/or selected to bind to a target. When two domains assemble with each other, they meet at a protein-protein interaction interface. As used herein, a “protein-protein interaction interface,” an “interaction interface,” or an “interface” includes those “contact” residues (amino acid or other non-amino acid residues such as carbohydrate groups, NADH, biotin, FAD or heme group) in the first domain that interact with one or more “contact” residues (amino acid or other non-amino acid groups) in the interface of the second domain. As used herein, a “contact” residue refers to any amino acid or non-amino acid residue from one domain that interacts with another amino acid or non-amino acid residue from a different domain by van der Waals forces, hydrogen bonds, water-mediated hydrogen bonds, salt bridges or other electrostatic forces, attractive interactions between aromatic side chains, the formation of disulfide bonds, or other forces known to one skilled in the art. Typically, the distance between alpha carbons of two interacting contact amino acid residues in the interaction interface is no greater than 12 Å. More typically, the distance between alpha carbons of two interacting contact amino acid residues in the interaction interface is no greater than 11 Å.
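The alpha-carbon distance criterion above can be illustrated with a short sketch. It is only a geometric screen under stated assumptions: the function name `ca_contacts`, the dictionary layout, and the coordinates are hypothetical and are not taken from any real structure, and a distance within the cutoff does not by itself establish that two residues interact.

```python
import math

def ca_contacts(ca_first, ca_second, cutoff=12.0):
    """Return (residue_i, residue_j, distance) for every pair of alpha carbons,
    one from each domain, that lie within `cutoff` angstroms of each other."""
    pairs = []
    for i, coord_i in ca_first.items():
        for j, coord_j in ca_second.items():
            distance = math.dist(coord_i, coord_j)
            if distance <= cutoff:
                pairs.append((i, j, distance))
    return pairs

# Hypothetical alpha-carbon coordinates in angstroms, keyed by residue number.
domain_1 = {366: (0.0, 0.0, 0.0), 407: (5.0, 0.0, 0.0)}
domain_2 = {407: (0.0, 8.0, 0.0), 351: (30.0, 0.0, 0.0)}

contacts = ca_contacts(domain_1, domain_2)               # 12 angstrom screen
tighter = ca_contacts(domain_1, domain_2, cutoff=11.0)   # 11 angstrom screen
```

Passing `cutoff=11.0` applies the narrower of the two typical limits stated above.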

As used herein, a “parent domain” refers to any existing assembly domain as described above that can be used as a parent sequence for designing an engineered domain by the strand exchange strategy. Suitable parent domains are typically related or homologous and have particular assembly specificity. “Homologous” typically means two domains sharing at least 35%, 40%, 45%, 50%, 55%, 60%, 62%, 65%, 68%, 70%, 75%, 80%, 85%, 90%, 95% or 99% sequence identity. If parent domains are present in a common solution, they may tend to homodimerize rather than heterodimerize with one another. As used herein, “existing assembly domains” include wild-type or naturally-occurring sequences from organisms such as human, mouse, yeast, bacteria, to name but a few, as well as derivative sequences that have been modified from the wild-type sequences, such as, for example, sequences that have been stabilized; rendered less immunogenic; given altered, enhanced or diminished assembly specificity, altered enzymatic properties, altered solubility, or enhanced expression; truncated; or fused to another polypeptide. “Existing assembly domains” can also be partially- or fully-synthetic sequences that are synthesized based on molecular design, in vitro or in vivo selection methods (e.g., yeast two-hybrid system, phage display), or combinations thereof.
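As a rough illustration of the sequence identity thresholds above, the following sketch computes percent identity over two pre-aligned sequences. The description does not say how identity is to be calculated, so the convention used here (identical positions divided by aligned, non-gap positions) is an assumption, as are the function name `percent_identity` and the short example strings.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over positions where neither aligned sequence has a gap.
    Assumed convention: identical positions / compared positions * 100."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip gapped columns
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0

# Short illustrative strings only; real parent domains are much longer.
identity = percent_identity("GQPREPQVYT", "GQPFRPEVHL")
is_homologous = identity >= 35.0  # lowest threshold listed above
```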

An “engineered domain” refers to a domain engineered from at least two nonidentical parent domains. An engineered domain is also referred to as a daughter domain. Typically, an engineered domain of the present invention contains amino acid sequence segments derived from two or more existing homologous parent domains. Preferably, the interface of an engineered domain includes amino acids derived from more than one parent domain. The presence of amino acids from different parent domains confers an assembly specificity distinct from the assembly specificities of the parent domains. For example, the presence of the amino acids from different parent domains promotes or enhances heterodimerization or heteromultimerization.

A Strand Exchange Engineered Domain (SEED) is an engineered domain that is engineered from at least two nonidentical parent domains by the strand exchange engineering method described in detail below.

As used herein, a “polypeptide” refers generally to any polypeptide or protein having more than about ten amino acids. Preferably, mammalian polypeptides (polypeptides that were originally derived from a mammalian organism) are used for SEED engineering, more preferably those which are directly secreted into the medium. Examples of bacterial polypeptides include, e.g., alkaline phosphatase and β-lactamase. Examples of mammalian polypeptides include molecules such as renin; a growth hormone, including human growth hormone; bovine growth hormone; growth hormone releasing factor; parathyroid hormone; thyroid stimulating hormone; lipoproteins; α-1-antitrypsin; insulin A-chain; insulin B-chain; proinsulin; follicle stimulating hormone; calcitonin; luteinizing hormone; glucagon; clotting factors such as factor VIIIC, factor IX, tissue factor, and von Willebrand factor; anti-clotting factors such as Protein C; atrial natriuretic factor; lung surfactant; a plasminogen activator, such as urokinase or human urine or tissue-type plasminogen activator (t-PA); bombesin; thrombin; hemopoietic growth factor; tumor necrosis factor-α and -β; enkephalinase; RANTES (regulated on activation normally T-cell expressed and secreted); human macrophage inflammatory protein (MIP-1-α); a serum albumin such as human serum albumin; Muellerian-inhibiting substance; relaxin A-chain; relaxin B-chain; prorelaxin; mouse gonadotropin-associated peptide; DNase; inhibin; activin; vascular endothelial growth factor (VEGF); receptors for hormones or growth factors; integrin; protein A or D; rheumatoid factors; a neurotrophic factor such as brain-derived neurotrophic factor (BDNF), neurotrophin-3, -4, -5, or -6 (NT-3, NT-4, NT-5, or NT-6), or a nerve growth factor such as NGF-beta; platelet-derived growth factor (PDGF); fibroblast growth factor such as aFGF and bFGF; epidermal growth factor (EGF); transforming growth factor (TGF) such as TGF-α and TGF-β, including TGF-β1, TGF-β2, TGF-β3, TGF-β4, or TGF-β5; insulin-like growth factor-I and -II (IGF-I and IGF-II); des(1-3)-IGF-I (brain IGF-I); insulin-like growth factor binding proteins; CD proteins such as CD-3, CD-4, CD-8, and CD-19; erythropoietin; osteoinductive factors; immunotoxins; a bone morphogenetic protein (BMP); an interferon such as interferon-alpha, -beta, and -gamma; colony stimulating factors (CSFs), e.g., M-CSF, GM-CSF, and G-CSF; interleukins (ILs), e.g., IL-1 to IL-10; superoxide dismutase; T-cell receptors; surface membrane proteins; decay accelerating factor; transport proteins; homing receptors; addressins; regulatory proteins; immunoglobulins (antibodies); and fragments of any of the above-listed polypeptides.

As used herein, the “first polypeptide” or “first subunit” is any polypeptide which is to be associated with a second polypeptide through the interaction between the engineered domains. The “second polypeptide” or “second subunit” is any polypeptide which is to be associated with the first polypeptide through the interaction between the engineered domains. In addition to the engineered domains, the first and/or the second polypeptide may include one or more additional bio-active domains (such as, for example, an antibody variable domain, receptor binding domain, ligand binding domain or enzymatic domain) or other “binding domains” such as antibody constant domains (or parts thereof) including CH3 and CH2 domains. As an example, the first polypeptide may include at least one engineered domain of the invention, such as an engineered CH3 domain of an immunoglobulin, which can form the interface of the first polypeptide. The first polypeptide may further include other antibody heavy chain binding domains (e.g., CH1, CH2, or CH4), and additional bio-active domains, such as receptor polypeptides (especially those which form dimers with another receptor polypeptide, e.g., interleukin-8 receptor and integrin heterodimers, e.g., LFA-1 or GPIIb/IIIa), ligand polypeptides (e.g., cytokines, nerve growth factor, neurotrophin-3, and brain-derived neurotrophic factor; see Arakawa et al. (1994) J. Biol. Chem. 269(45):27833-27839 and Radziejewski et al. (1993) Biochem. 32(48):1350) and antibody variable domain polypeptides (e.g., diabodies and BsAbs).

As used herein, “assembly” refers to a protein-protein interaction that occurs during the production of a multisubunit protein. For example, during antibody production, the heavy and light chains are synthesized from ribosomes associated with the endoplasmic reticulum. The individual chains then fold, and then assemble into mature antibodies through proper association of heavy and light chains. For example, in the case of IgG antibodies, the assembly of the Fab portion is initially driven primarily by interactions between the CH1 and CL domains, and also by interactions between the VH and VL regions. In the case of the two heavy chains, the initial assembly reaction is the association of the two CH3 domains. These initial assembly reactions are usually, but not always, followed by disulfide bond formation between the assembled subunit polypeptides. As used herein, “assembly” is distinct from “binding”; assembly refers to the protein interaction events that occur during production of a mature protein, such as an antibody before it is secreted from a cell, while binding refers to protein interaction events that occur after secretion, such as the interaction of an antibody with an antigen or with an Fc receptor. In an operational sense, assembly of a therapeutic or diagnostic protein occurs during the preparation of the therapeutic protein up to and including the placement of a product in a vial, and binding of a therapeutic or diagnostic protein refers to events that occur after a therapeutic protein is administered to a patient or when a diagnostic protein is used in a diagnostic test.

By “binding” is meant the interaction of a protein with a target protein subsequent to the synthesis and assembly of the protein.

Strand Exchange Engineering

The invention uses the fact that natural protein domains mediating protein-protein interactions are often homologous or, in the case of homodimers, identical, and that such proteins and domains often only homodimerize with themselves but typically do not heterodimerize with other family members or do not heterodimerize with other family members with an affinity equal to or greater than their affinities for homodimerization. According to the invention, such proteins may be used to design heterodimeric or heteromultimeric proteins using the strand exchange engineering methods described in detail below. Such engineered domains are also referred to as “Strand Exchange Engineered Domains” (“SEEDs”). Multidomain proteins containing such engineered domains are also referred to as strand exchange engineered proteins.

Strand exchange engineering typically begins with a structural model of a dimeric parent protein domain. Two parent domains that can each homodimerize or dimerize with its own assembly partner but not heterodimerize with each other are structurally aligned. The parent domains may dimerize in a face-to-face manner, i.e., the dimer partners may be related by a 180-degree rotational symmetry. The parent domains may also dimerize in a front-to-back manner.

Due to the geometry of rotational symmetry of homodimeric proteins, there is usually a line of amino acids in the interaction surface that interact in a homotypic manner. In other words, there are amino acids that interact with their counterparts in the other subunit. For example, in the CH3 domain of IgG1, these amino acids include L351, P352, T366, T394, P395, and Y407. This line of amino acids will generally be parallel to the axis of rotational symmetry of the dimer. In choosing parent domains, it is often useful to choose proteins that homodimerize such that the long axis of the dimerization interface is not strongly parallel to the axis of rotational symmetry. For example, SEEDs based on leucine-zipper family members are difficult to construct, because the dimerization interface is parallel to the axis of symmetry, and many of the amino acid interactions are homotypic. Accordingly, in some preferred embodiments, the engineered domains of the invention are not leucine-zipper domains. In contrast, the CH3 family domains are particularly useful because a significant portion of the interaction surface lies outside the line of symmetry. It will be recognized by those skilled in the art, however, that the line of symmetry (i.e., a line of homotypically interacting amino acids) may be an oversimplification. For example, the side-chains of amino acids on the line of symmetry may point toward the hydrophobic core of the domain.

A new dimerization interface is conceptually designed and divided into at least two regions which typically lie on either side of the homotypic interaction line (i.e., the line of symmetry). New domains are then designed by strand exchange wherein two daughter domain linear amino acid sequences are constructed from two aligned parent domain amino acid sequences by taking complementary segments from each parent sequence. As a result, in the regions of the dimerization interface, the two daughter domains (i.e., two SEEDs) have complementary amino acid segments from parent domains. This concept is illustrated in FIGS. 1A and 1B. As shown in FIG. 1A, two daughter SEED sequences, 1 and 2, are engineered from two parent sequences, A and B, in an entirely complementary manner. If Daughter 1 has an amino acid segment from Parent A at a given region of the interaction interface, Daughter 2 will have the corresponding amino acid segment from Parent B. The interaction interface is designed such that at least one amino acid sequence segment on Daughter 1 interacts with an amino acid sequence segment on Daughter 2 that is derived from the same parent domain. In FIG. 1B, the daughter SEED domains are derived primarily from one parent domain. However, the amino acids at the dimerization interface on either daughter SEED domain are derived from either one parent or another in a complementary manner.
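The entirely complementary case described for FIG. 1A can be sketched at the level of linear sequences as follows. This is a minimal illustration, not the design procedure itself: it assumes the two parent sequences are already aligned to equal length without gaps, the function name `strand_exchange` and the 0-based exchange indices are hypothetical, and the placeholder parents are written as runs of "A" and "B" so that the origin of each segment is visible.

```python
def strand_exchange(parent_a, parent_b, exchange_points):
    """Build two complementary daughter sequences from two aligned parents.

    exchange_points: sorted 0-based indices at which the source parent switches.
    Daughter 1 starts with Parent A; Daughter 2 always takes the other parent.
    """
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must be aligned to equal length")
    bounds = [0, *exchange_points, len(parent_a)]
    daughter_1, daughter_2 = [], []
    for k, (start, end) in enumerate(zip(bounds, bounds[1:])):
        # Alternate the source parent at every exchange point.
        first, second = (parent_a, parent_b) if k % 2 == 0 else (parent_b, parent_a)
        daughter_1.append(first[start:end])
        daughter_2.append(second[start:end])
    return "".join(daughter_1), "".join(daughter_2)

# Placeholder parents: each letter marks which parent a residue came from.
parent_a = "AAAAAAAAAAAA"
parent_b = "BBBBBBBBBBBB"
d1, d2 = strand_exchange(parent_a, parent_b, [4, 8])
```

With these placeholder inputs, every position of one daughter carries the parent that the other daughter lacks, which is the complementarity the paragraph describes. A FIG. 1B style design would instead restrict the exchange to interface positions.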

It should be noted that FIG. 1A and FIG. 1B represent two extreme examples of the invention, and that SEEDs may be engineered by methods of the invention that have designs intermediate between FIG. 1A and FIG. 1B. For example, as described in the Examples in more detail, it is possible to construct a SEED based on parent domains from the immunoglobulin CH3 domain family. The daughter SEEDs may be derived primarily in a complementary manner from IgG and IgA, but the amino acids that interact with FcRn are derived from IgG to preserve the interaction with FcRn.

Thus, SEEDs are typically engineered by combining two or more homologous parent domains. The parent domains are polypeptides that differ from one another by at least four amino acids. In making a SEED, the sequences of the original polypeptides are aligned based on their homologies, theoretical structural models, crystal or solution structures, or any combinations thereof. There is at least one different amino acid at one or more aligned sequence positions, or a different number of amino acids in at least one pair of aligned original sequences. The parent sequences are then divided into at least two segments including at least one amino acid each. A SEED sequence may be composed by choosing, from among the original sequences, the one desired for each divided segment. A SEED will often differ from each individual parent sequence by at least two consecutive amino acids, and sometimes by three, four or more consecutive amino acids. In addition to selecting sequences from the original parent polypeptides, a SEED can contain any desired amino acids at any positions, such as positions outside the designed interface, in order to satisfy the other design needs.

There are positions on the sequence of the SEED where the parent sequence changes from one parent to a second parent. These positions are called exchange points or exchange positions. Exchange points or exchange positions can include one or more amino acids whose identity may be shared by both parents. Typically, exchange points are chosen from the amino acids on or near the line of symmetry, although other exchange points can also be chosen. Exchange points can also include amino acids not shared by the parents. In this case, the sequence abruptly switches from one parent to another. Furthermore, exchange points can include one or more novel amino acids not belonging to any of the parents. In this case, typically, different parent sequences appear on either side of the novel amino acids. If there are multiple exchange points in the sequence of a SEED, the total number of parent segments can be greater than two, up to a number one greater than the number of exchange points. These parent segments can be selected from distinct parent domains. Thus, the present invention contemplates SEEDs that are engineered from more than two parent domains.
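The relationship between exchange points and parent segments (at most one more segment than exchange points, each segment drawn from any parent) can be sketched as below. The function name `compose_seed`, the parent labels, and the lowercase placeholder sequences are hypothetical; the sketch assumes gap-free aligned parents of equal length and does not model novel amino acids at the exchange points.

```python
def compose_seed(parents, exchange_points, choice):
    """Compose one SEED sequence by picking a parent for each segment.

    parents: dict mapping a parent label to its aligned sequence.
    exchange_points: sorted 0-based indices where the source parent may change.
    choice: one parent label per segment, N-terminus first.
    """
    lengths = {len(seq) for seq in parents.values()}
    if len(lengths) != 1:
        raise ValueError("parents must be aligned to equal length")
    (length,) = lengths
    bounds = [0, *exchange_points, length]
    if len(choice) != len(bounds) - 1:
        raise ValueError("need exactly one parent per segment (exchange points + 1)")
    return "".join(
        parents[label][start:end]
        for label, start, end in zip(choice, bounds, bounds[1:])
    )

# Placeholder parents; each letter marks the parent of origin.
parents = {"P1": "aaaaaaaaa", "P2": "bbbbbbbbb", "P3": "ccccccccc"}
seed = compose_seed(parents, [3, 6], ["P1", "P2", "P3"])
```

Two exchange points give three segments here, so three distinct parents can contribute, consistent with the statement that SEEDs may be engineered from more than two parent domains.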

For purposes of convenience, each SEED is typically named according to the order of its parent sequences, beginning with the N-terminus of the SEED. In the examples given below, an AG SEED has an IgA1 sequence segment on the N-terminal end, which then changes to an IgG1 sequence segment at the first exchange point. A GA SEED has an IgG1 sequence segment on the N-terminal end, which then changes to an IgA1 sequence at the first exchange point.

Thus, the interaction interface of the SEEDs of the invention includes amino acid sequence segments derived from two or more parent domains. As a result, the interface of the SEEDs has interaction properties distinct from interaction properties of the parent domains. In particular, the presence of amino acids from different parent domains confers an assembly specificity distinct from the assembly specificity of either of the parent domains. For example, the specificity of heterodimerization or heteromultimerization is enhanced by the presence of amino acids from different parent domains on the interface of a SEED. As a result, a pair of SEEDs form heterodimers with one another preferentially over forming homodimers. Thus, when a pair of SEEDs are expressed in an expression system, heterodimers of the SEEDs can specifically assemble such that the heterodimeric SEEDs can be directly recovered from the cell culture system without the need for elaborate separation steps to remove the homodimers.

CH3-Based SEEDs

Backbone homology and differences between the dimerization interfaces of the parent domains are important for creating SEEDs. Thus, according to one embodiment of the invention, the classes of immunoglobulin proteins are a useful source for parent domains. SEEDs can be created by using parental sequences from two different immunoglobulin classes. For example, SEEDs can be engineered from CH3 family domains by the method of the invention. CH3 family domains suitable for designing SEEDs include, but are not limited to, CH3 domains of IgG1, IgG2, IgG3, IgG4, IgA, and IgD, and the CH4 domains of IgE and IgM.

CH3 domains of human IgG1 and IgA form homodimers but do not form heterodimers with each other. Therefore, pairs of SEEDs (e.g., an AG SEED and a GA SEED) can be engineered from IgG1 and IgA CH3 domains such that they can heterodimerize with each other but their ability to homodimerize is minimal. According to one embodiment, the assembly interface on the CH3 domain is divided into two regions, which lie on either side of the line of homotypic interactions. Homotypic interactions for the IgA and IgG1 CH3 domains can be determined by observation and by probing the crystal structure with a 1.4 Å sphere to determine whether or not the two side chains are close enough to exclude water. If the surfaces join together across the interface, this implies that the side chains are closely interacting. For example, in the wild type CH3 domain of IgG1, the homotypically interacting amino acids include, but are not limited to, L351, P352, T366, T394, P395, and Y407. For the wild type CH3 domain of IgA1, the homotypically interacting amino acids include, but are not limited to, L352, P353, T368, W398, A399 and T414. In one exemplary SEED subunit, those amino acids with outwardly-pointing side-chains that lie to the left of the line of homotypic interaction are taken from the CH3 of IgG1, and those with outwardly-pointing side-chains to the right of the line of homotypic interaction are taken from the CH3 of IgA. In the other SEED subunit, those amino acids with outwardly-pointing side-chains that lie to the left of the line of homotypic interaction are taken from the CH3 of IgA, and those with outwardly-pointing side-chains to the right of the line of homotypic interaction are taken from the CH3 of IgG1. The choice of amino acids along the line of homotypic interaction is based on structural considerations and performed on a case-by-case basis, although it is likely that the amino acids from either parent domain can be selected for a particular region of a SEED.

For example, a CH3-based AG SEED may have a polypeptide sequence as shown in SEQ ID NO:1, wherein X₁, X₂, or X₃ may be any amino acids. In some embodiments, X₁ is K or S, X₂ is V or T, and X₃ is T or S. Preferably, X₁ is S, X₂ is V or T, and X₃ is S. A CH3-based GA SEED may have a polypeptide sequence as shown in SEQ ID NO:2, wherein X₁, X₂, X₃, X₄, X₅, or X₆ may be any amino acids. In some embodiments, X₁ is L or Q, X₂ is A or T, X₃ is L, V, D, or T, X₄ is F, A, D, E, G, H, K, N, P, Q, R, S, or T, X₅ is A or T, and X₆ is E or D. Preferably, X₁ is Q, X₂ is A or T, X₃ is L, V, D, or T, X₄ is F, A, D, E, G, H, K, N, P, Q, R, S, or T, X₅ is T, and X₆ is D. Exemplary SEED heterodimers may include one SEED subunit selected from AG(f0) SEED (SEQ ID NO:3), AG(f1) SEED (SEQ ID NO:4), or AG(f2) SEED (SEQ ID NO:5), and the other SEED subunit selected from GA(f0) SEED (SEQ ID NO:6), GA(f1) SEED (SEQ ID NO:7), GA(f2) SEED (SEQ ID NO:8), or GA(f3) SEED (SEQ ID NO:9). For example, a SEED heterodimer may include AG(f0) SEED (SEQ ID NO:3) and GA(f0) SEED (SEQ ID NO:6) subunits. In another example, a SEED heterodimer may include AG(f2) SEED (SEQ ID NO:5) and GA(f2) SEED (SEQ ID NO:8) subunits. In yet another embodiment, a SEED heterodimer may include AG(s0) SEED (SEQ ID NO:10) and GA(s0) SEED (SEQ ID NO:11) subunits.
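To give a sense of how many sequence variants the “some embodiments” residue choices above span, the following sketch enumerates the allowed combinations at the variable positions. It works only with the X position labels and the residue options stated in the paragraph; it does not place them into SEQ ID NO:1 or SEQ ID NO:2, and the names `AG_CHOICES`, `GA_CHOICES`, `count_variants` and `enumerate_variants` are hypothetical.

```python
from itertools import product
from math import prod

# Allowed residues per variable position, as listed for "some embodiments".
AG_CHOICES = {"X1": "KS", "X2": "VT", "X3": "TS"}
GA_CHOICES = {
    "X1": "LQ",
    "X2": "AT",
    "X3": "LVDT",
    "X4": "FADEGHKNPQRST",
    "X5": "AT",
    "X6": "ED",
}

def count_variants(choices):
    """Number of distinct residue combinations across all variable positions."""
    return prod(len(options) for options in choices.values())

def enumerate_variants(choices):
    """Yield each combination as a dict mapping position label to residue."""
    labels = list(choices)
    for combo in product(*(choices[label] for label in labels)):
        yield dict(zip(labels, combo))

ag_variants = list(enumerate_variants(AG_CHOICES))
ag_count = count_variants(AG_CHOICES)
ga_count = count_variants(GA_CHOICES)
```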

Bio-Active Domains

The SEEDs according to this invention are particularly useful when coupled with a fusion partner. A fusion partner (X) can be fused to the N-terminus of the SEED (X-SEED); it can also be fused to the C-terminus of the SEED (SEED-X). In addition, a fusion partner can be fused to the N-terminus and the C-terminus of the SEED at the same time (X-SEED-X). Two different fusion partners can be fused to a SEED (X-SEED-Y).

Given that two SEED sequences typically form heterodimers, it is possible that at least one, two, three, or four fusion partners can be contemplated in the SEED heterodimer. For example, according to one embodiment, the first daughter SEED has one fusion partner, and the second daughter SEED has no fusion partner, resulting in the following exemplary configurations: SEED-X heterodimerized to SEED; or X-SEED heterodimerized to SEED. In a further example, the first daughter SEED has two different fusion partners (X, Y), and the second daughter SEED has two different fusion partners (W, Z) differing from the fusion partners of the first daughter SEED. Possible exemplary configurations include, but are not limited to: X-SEED-Y heterodimerized to W-SEED-Z; X-SEED-Y heterodimerized to Z-SEED-W; Y-SEED-X heterodimerized to W-SEED-Z; or Y-SEED-X heterodimerized to Z-SEED-W. According to the invention, a SEED can also have two or more fusion partners (X) fused sequentially to, for example, the N-terminus (X-X-SEED). Alternately, in another embodiment of the invention, the first daughter SEED has one fusion partner (X), and the second daughter SEED has one fusion partner (Y), resulting in the following exemplary configurations: X-SEED heterodimerized to Y-SEED; X-SEED heterodimerized to SEED-Y; or SEED-X heterodimerized to SEED-Y. In yet another embodiment of the invention, the first daughter SEED has one fusion partner (X), and the second daughter SEED has two fusion partners (Z, Y). Possible exemplary configurations include, but are not limited to: X-SEED heterodimerized to Y-SEED-Z; X-SEED heterodimerized to Z-SEED-Y; SEED-X heterodimerized to Z-SEED-Y; or SEED-X heterodimerized to Y-SEED-Z. Exemplary configurations are illustrated in FIG. 1C.
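The exemplary configurations listed above can be enumerated mechanically. The sketch below is illustrative only and makes several assumptions: each SEED carries zero, one or two distinct fusion partners, at most one partner sits on each terminus, and sequential fusions such as X-X-SEED are not modeled. The function names `arrangements` and `heterodimer_configurations` are hypothetical.

```python
from itertools import permutations, product

def arrangements(partners):
    """All placements of 0, 1 or 2 distinct fusion partners on one SEED,
    with at most one partner on each terminus."""
    if len(partners) == 0:
        return ["SEED"]
    if len(partners) == 1:
        (x,) = partners
        return [f"{x}-SEED", f"SEED-{x}"]
    if len(partners) == 2:
        return [f"{n}-SEED-{c}" for n, c in permutations(partners)]
    raise ValueError("this sketch handles at most two partners per SEED")

def heterodimer_configurations(first, second):
    """Pair every arrangement of the first SEED with every arrangement of the second."""
    return [
        f"{a} heterodimerized to {b}"
        for a, b in product(arrangements(first), arrangements(second))
    ]

# First SEED carries X and Y; second SEED carries W and Z.
configs = heterodimer_configurations(("X", "Y"), ("W", "Z"))
```

For partners (X, Y) and (W, Z) this yields the four configurations named in the paragraph. For the one-partner cases it may list placements beyond those given as examples, which is consistent with the lists being non-limiting.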

In particular, a fusion partner can be one or more bio-active domains including any biologically active protein or a biologically active portion thereof. For example, a bio-active domain can include an antibody constant or variable region, including, but not limited to, a VL domain, a VH domain, an Fv, a single-chain Fv, a diabody, an Fab fragment, a single-chain Fab, or an F(ab′)₂.

According to the invention, the fusion partners can be coupled to the SEED moieties directly or indirectly. For example, a fusion partner may be linked to a SEED moiety by a peptide linker, such as described in U.S. Pat. No. 5,258,498 and U.S. Pat. No. 5,482,858 to Huston et al., or U.S. Pat. No. 5,856,456 and U.S. Pat. No. 5,990,275 to Whitlow et al., the teachings of which are hereby incorporated by reference. Typically, a suitable peptide linker may contain glycine and serine residues. A suitable peptide linker may also have other properties. For example, in some embodiments, a linker may further include a protease cleavage site, such as a matrix metalloproteinase recognition site.

Thus, the present invention provides a novel method to produce multispecific antibodies based on SEED technology. A multispecific antibody is a molecule having binding specificities for at least two different antigens. While such molecules typically will only bind two antigens (i.e., BsAbs), antibodies with additional specificities such as trispecific or tetraspecific antibodies are encompassed by this expression when used herein. Examples of BsAbs include those that bind to different antigens on the same cell surface, or those that bind to a cell surface antigen and a non-cell surface antigen. A non-cell surface antigen includes, but is not limited to, an extracellular or intracellular antigen, a soluble or insoluble antigen. The multispecific antibodies may bind to different antigens simultaneously, although simultaneous binding is not required for the function of the multispecific antibodies. In some applications, the antigens are preferably functionally related, such as EGFR and HER2. Particularly useful types of multispecific antibodies include, but are not limited to, anti-EGFR/anti-HER2; anti-EGFR/anti-HER2/anti-HER3; anti-EGFR/anti-HER3; anti-EGFR/anti-HER2/anti-IGF1R; anti-EGFR/anti-HER2/anti-HER3/anti-IGF1R; anti-EGFR/anti-HER3/anti-IGF1R; anti-EGFR/anti-IGF1R; and anti-HER2/anti-IGF1R. Other combinations of specificities involving the EGFR, HER family and IGF1R are within the scope of the present invention.

Further examples of BsAbs include those with one arm directed against a tumor cell antigen and the other arm directed against a cytotoxic trigger molecule such as anti-FcγRI/anti-CD15, anti-p185^(HER2)/FcγRIII (CD16), anti-CD3/anti-malignant B-cell (1D10), anti-CD3/anti-p185^(HER2), anti-CD3/anti-p97, anti-CD3/anti-renal cell carcinoma, anti-CD3/anti-OVCAR-3, anti-CD3/L-D1 (anti-colon carcinoma), anti-CD3/anti-melanocyte stimulating hormone analog, anti-EGF receptor/anti-CD3, anti-CD3/anti-CAMA1, anti-CD3/anti-CD19, anti-CD3/MoV18, anti-neural cell adhesion molecule (NCAM)/anti-CD3, anti-folate binding protein (FBP)/anti-CD3, anti-pan carcinoma associated antigen (AMOC-31)/anti-CD3; BsAbs with one arm which binds specifically to a tumor antigen and one arm which binds to a toxin such as anti-saporin/anti-Id-1, anti-CD22/anti-saporin, anti-CD7/anti-saporin, anti-CD38/anti-saporin, anti-CEA/anti-ricin A chain, anti-interferon-α (IFN-α)/anti-hybridoma idiotype, anti-CEA/anti-vinca alkaloid; BsAbs for converting enzyme activated prodrugs such as anti-CD30/anti-alkaline phosphatase (which catalyzes conversion of mitomycin phosphate prodrug to mitomycin alcohol); BsAbs which can be used as fibrinolytic agents such as anti-fibrin/anti-tissue plasminogen activator (tPA), anti-fibrin/anti-urokinase-type plasminogen activator (uPA); BsAbs for targeting immune complexes to cell surface receptors such as anti-low density lipoprotein (LDL)/anti-Fc receptor (e.g., FcγRI, FcγRII or FcγRIII); BsAbs for use in therapy of infectious diseases such as anti-CD3/anti-herpes simplex virus (HSV), anti-T-cell receptor:CD3 complex/anti-influenza, anti-FcγR/anti-HIV; BsAbs for tumor detection in vitro or in vivo such as anti-CEA/anti-EOTUBE, anti-CEA/anti-DPTA, anti-p185^(HER2)/anti-hapten; BsAbs as vaccine adjuvants; and BsAbs as diagnostic tools such as anti-rabbit IgG/anti-ferritin, anti-horse radish peroxidase (HRP)/anti-hormone, anti-somatostatin/anti-substance P, anti-HRP/anti-FITC, anti-CEA/anti-β-galactosidase. Examples of trispecific antibodies include anti-CD3/anti-CD4/anti-CD37, anti-CD3/anti-CD5/anti-CD37 and anti-CD3/anti-CD8/anti-CD37.

According to the invention, other bio-active domains include hormones, cytokines, chemokines, secreted enzymes, ligands, extracellular portions of trans-membrane receptors, or receptors. Hormones include, but are not limited to, growth hormones, or glucagon-like peptide (GLP-1). Cytokines include, but are not limited to, interleukin-2 (IL-2), IL-4, IL-5, IL-6, IL-7, IL-10, IL-12, IL-13, IL-14, IL-15, IL-16, IL-18, IL-21, IL-23, IL-31; hematopoietic factors such as granulocyte-macrophage colony stimulating factor (GM-CSF), G-CSF and erythropoietin; tumor necrosis factors such as TNF-α; lymphokines such as lymphotoxin; regulators of metabolic processes such as leptin; and interferons (IFN) such as IFN-α, IFN-β, and IFN-γ.

Thus, the engineered heteromeric proteins of the present invention permit the colocalization of different bio-active domains in a biological system. This can be accomplished, for example, in the context of a multimeric protein incorporating two different antibody variable domains, where one antibody variable domain is fused to one engineered domain and a second antibody variable domain is fused to a second engineered domain that preferentially assembles with the first engineered domain. Administration of such an engineered protein causes two distinct activities (in this case, binding activities) to be present in the same molecule in the biological system, colocalizing the activities within the biological system. Whether the activities involve binding to other molecules (as an antibody variable domain/antigen interaction, a ligand/receptor interaction, etc.), enzymatic activities, or a combination thereof, the present invention provides a system to require that the activities be present at the same place, permitting, for example, the targeting of a therapeutic activity to a particular cell type or location; the crosslinking of different receptors or cells; the colocalization of an antigen and adjuvant; etc. This can be accomplished by direct administration of an engineered heteromeric protein to a biological system or by expression of nucleic acid encoding the subunits within the biological system. Nucleic acid expression permits the engineering of additional levels of control in the system. For example, the expression of each subunit can be differentially regulated, such that the complete heteromeric protein and the resulting colocalization of activities occur only when all conditions required for expression of each subunit are met.

Engineered Domains with Reduced Immunogenicity

In another embodiment of the invention, the SEED sequences can be modified to reduce their potential immunogenicity. Because SEED polypeptides are hybrids between two different naturally occurring human sequences, they include sequence segments at their junctions that are not found in natural human proteins. In an organism, these sequence segments may be processed into non-self T-cell epitopes.

Methods to analyze peptide sequences for their potential to create T-cell epitopes are well known in the art. For example, ProPred (http://www.imtech.res.in/raghava/propred; Singh and Raghava (2001) Bioinformatics 17:1236-1237) is a publicly available web-based tool that can be used for the prediction of peptides that bind HLA-DR alleles. ProPred is based on a matrix prediction algorithm described by Sturniolo for a set of 50 HLA-DR alleles (Sturniolo et al., (1999) Nature Biotechnol. 17:555-561). Using such an algorithm, various peptide sequences were discovered within AG SEED and GA SEED polypeptide sequences which are predicted to bind to multiple MHC class II alleles with significant binding strength and are therefore potentially immunogenic.

For example, in one embodiment, the AG SEED and GA SEED sequences are modified to remove one or more T-cell epitopes present in the SEED sequence. This modification may include substitution, deletion, or modification of one or more amino acid residues in order to remove the T-cell epitope. Table 1 presents a list of peptide sequences that are potential T-cell epitopes in the AG SEED and GA SEED, and possible amino acid substitutions that are predicted to reduce or remove the T-cell epitope.

TABLE 1

Pos   Peptide                      Amino Acid Substitution

AG(f0) SEED
32    FYPKDIAVE (SEQ ID NO: 12)    K35S
67    FAVTSKLTV (SEQ ID NO: 13)    V75T
69    VTSKLTVDK (SEQ ID NO: 14)
99    YTQKTISLS (SEQ ID NO: 15)    T103S

GA(f0) SEED
18    LALNELVTL (SEQ ID NO: 16)    L23Q
20    LNELVTLTC (SEQ ID NO: 17)
23    LVTLTCLVK (SEQ ID NO: 18)
54    YLTWAPVLD (SEQ ID NO: 19)    A58T
55    LTWAPVLDS (SEQ ID NO: 20)    L61V,D,T
61    LDSDGSFFL (SEQ ID NO: 21)    L61V,D,T
67    FFLYSILRV (SEQ ID NO: 22)    F67A,D,E,G,H,K,N,P,Q,R,S,T
68    FLYSILRVA (SEQ ID NO: 23)    A76T
69    LYSILRVAA (SEQ ID NO: 24)    E78D
70    YSILRVAAE (SEQ ID NO: 25)
72    ILRVAAEDW (SEQ ID NO: 26)

Table 1 shows peptides in AG(f0) SEED or GA(f0) SEED which are predicted to bind to HLA-DR alleles and are potential T-cell epitopes, and amino acid substitutions at specific residues (indicated in bold) within the peptides that are predicted to reduce the binding to HLA-DR alleles. "Pos" indicates the position of the peptide within the sequence. The numbering of the amino acids is sequential and relative to the first amino acid of the SEED molecule.
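The position and substitution notation used in Table 1 can be illustrated with a short sketch. It assumes 1-based sequential numbering as stated in the table legend and uses the AG(f0) SEED sequence as it appears in the alignment in this section; the helper names `peptide_at` and `apply_substitution` are hypothetical and are not part of the disclosure.

```python
import re

# AG(f0) SEED residues 1-106, copied from the alignment in this section.
AG_F0 = (
    "GQPFRPEVHLLPPSREEMTKNQVSLTCLARGFYPKDIAVEWESNGQPENNYKTTPSRQEP"
    "SQGTTTFAVTSKLTVDKSRWQQGNVFSCSVMHEALHNHYTQKTISL"
)


def peptide_at(seq, pos, length=9):
    """Return the peptide of `length` residues starting at 1-based position `pos`."""
    return seq[pos - 1 : pos - 1 + length]


def apply_substitution(seq, sub):
    """Apply a substitution written as <old residue><position><new residue>, e.g. 'K35S'."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub)
    if match is None:
        raise ValueError(f"unrecognized substitution: {sub!r}")
    old, pos, new = match.group(1), int(match.group(2)), match.group(3)
    # Guard against numbering mistakes: the stated old residue must be present.
    if seq[pos - 1] != old:
        raise ValueError(f"expected {old} at position {pos}, found {seq[pos - 1]}")
    return seq[: pos - 1] + new + seq[pos:]


ag_k35s = apply_substitution(AG_F0, "K35S")
print(peptide_at(AG_F0, 32))    # FYPKDIAVE
print(peptide_at(ag_k35s, 32))  # FYPSDIAVE
```

The residue check inside `apply_substitution` is a design choice of this sketch: it fails early if a substitution is applied against a sequence with a different numbering origin.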

The original “full” AG SEED (AG(f0) SEED (SEQ ID NO:3)) and GA SEED (GA(f0) SEED (SEQ ID NO:6)) polypeptides, and some exemplary variant polypeptides, including AG(f1) SEED (SEQ ID NO:4), AG(f2) SEED (SEQ ID NO:5), GA(f1) SEED (SEQ ID NO:7), GA(f2) SEED (SEQ ID NO:8), and GA(f3) SEED (SEQ ID NO:9), are shown in the following alignments.

Alignment of AG SEEDs (dot indicates residue identity)

  1 GQPFRPEVHLLPPSREEMTKNQVSLTCLARGFYPKDIAVEWESNGQPENNYKTTPSRQEP AG(f0) SEED
  1 ..................................S......................... AG(f1) SEED
  1 ..................................S......................... AG(f2) SEED

 61 SQGTTTFAVTSKLTVDKSRWQQGNVFSCSVMHEALHNHYTQKTISL AG(f0) SEED
 61 ..........................................S... AG(f1) SEED
 61 ..............T...........................S... AG(f2) SEED

Alignment of GA SEEDs (dot indicates residue identity)

  1 GQPREPQVYTLPPPSEELALNELVTLTCLVKGFYPSDIAVEWLQGSQELPREKYLTWAPV GA(f0) SEED
  1 ......................Q..................................T.. GA(f1) SEED
  1 ......................Q..................................... GA(f2) SEED
  1 ......................Q..................................T.. GA(f3) SEED

 61 LDSDGSFFLYSILRVAAEDWKKGDTFSCSVMHEALHNHYTQKSLDR GA(f0) SEED
 61 V..............T.D............................ GA(f1) SEED
 61 D.....H........T.D............................ GA(f2) SEED
 61 T.....D........T.D............................ GA(f3) SEED
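The dot notation in the alignments can be expanded back into full sequences mechanically. The following is a minimal sketch, assuming each dot stands for the reference residue in the same column; the function names are hypothetical, and the AG(f2) row is written with explicit run lengths so that the column count is easy to audit.

```python
# Residues 61-106 of AG(f0) SEED, the reference row of the second AG block.
AG_F0_61_106 = "SQGTTTFAVTSKLTVDKSRWQQGNVFSCSVMHEALHNHYTQKTISL"

# AG(f2) row for the same block: T in the column for position 75 and
# S in the column for position 103, dots elsewhere.
AG_F2_ROW = "." * 14 + "T" + "." * 27 + "S" + "." * 3


def expand_row(reference, row):
    """Replace each dot in `row` with the reference residue in that column."""
    if len(reference) != len(row):
        raise ValueError("row and reference must have the same length")
    return "".join(ref if col == "." else col for ref, col in zip(reference, row))


def differences(reference, variant, start=1):
    """List substitutions as <reference residue><position><variant residue>."""
    return [
        f"{ref}{start + i}{var}"
        for i, (ref, var) in enumerate(zip(reference, variant))
        if ref != var
    ]


ag_f2_61_106 = expand_row(AG_F0_61_106, AG_F2_ROW)
print(differences(AG_F0_61_106, ag_f2_61_106, start=61))  # ['V75T', 'T103S']
```

Both recovered substitutions appear in the AG(f0) SEED part of Table 1.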

Further exemplary embodiments according to the invention are detailed in the examples that follow.

EXAMPLES

Example 1: Identifying Homologous Structures to Become Parents of a SEED

In this set of examples, the goal is to produce two distinct CH3-homolog SEEDs that will form dimers that favor the formation of a heterodimer over formation of the two possible homodimers, thus resulting in a predominance of CH3-homolog heterodimers. The first task is to identify two or more CH3 domains that may produce this result when they are used as parents of a pair of SEEDs. The CH3 homodimer forms a dimerization interface between β-sheets. It is important to find two CH3 domains that have significant differences in this interface, in order to make an effective pair of SEEDs that will preferentially heterodimerize.

CH3 domains from IgG are structurally highly conserved across the animal kingdom, containing a classic immunoglobulin domain β-sandwich fold. While there are significant differences between species in the identities of the amino acids found on the outer surface, the dimerization interface surface that is buried upon dimerization is mostly conserved.

Each different class of immunoglobulin has its own Fc, and in particular has its own equivalent of the IgG CH3 sequence and structure. Examination of the CH3 domain in the crystal structure of the Fc portion of a human IgA1 (PDB number 1OW0, resolution 3.1 Å) revealed that the overall fold was homologous to human IgG CH3. The backbone RMSD (root mean square deviation) of the alignment of single CH3 domains from the IgA Fc 1OW0 and from the IgG Fc 1L6X, excluding turns where the alignment had different lengths, was about 0.99 Å (see Table 2). However, the CH3 homodimer interface of IgA1 is significantly different from that of IgG1. Thus, two SEEDs made from the CH3 of human IgA1 and the CH3 of human IgG1 each contain some portion of the interface from IgA1, and some from IgG1, and are designed to not dimerize with themselves, nor with either parent CH3, but to dimerize preferentially with the other complementary SEED.

TABLE 2
Structural alignment of CH3 domains from IgG and IgA*

Human IgG      Human IgA
Q342-M358      N343-L359
N361-P387      E363-E389
N390-S400      K394-P404
G402-L443      T409-R450

*Portions of IgG and IgA sequences were used to overlay structures and determine backbone RMSD. The program InsightII (Accelrys, San Diego, CA) superimposed the backbone atoms of the residues included in the structurally homologous sequences listed above. The RMSD between the two backbones (within the ranges in the table) was 0.99 Å.

For example, the CH3 domain from human IgA1 and the CH3 domain from human IgG1 were used as parental polypeptides. For structural alignment and modeling, IgG1 PDB entries 1DN2 (resolution 2.7 Å) and 1L6X (CH3 sequence highly homologous to 1DN2, with two minor differences, resolution 1.65 Å), and IgA1 PDB entry 1OW0 (resolution 3.1 Å) were used. FIG. 2 shows the structural alignment of the two sequences. The CH3 domain of IgG1 is numbered according to Kabat EU Index (Kabat et al., (1991) Sequences of Proteins of Immunological Interest, 5th Edition, NIH Publication 91-3242), while the CH3 domain of IgA is sequentially numbered as in the PDB structure 1OW0. Bold letters designate the backbone positions that were included in the alignment described in Table 2, which were further used to design junctional crossover points in designing SEEDs.

Example 2: Choice of Exchange Points

Once the structural alignment is determined and the interface residues identified, the exchange points are ready to be chosen for creating the SEEDs. The CH3 homodimer has 180° rotational symmetry around an axis that runs between the domains approximately perpendicular to the beta strands (FIG. 4). Each domain has the N-terminus and C-terminus on opposite sides of the axis of symmetry. Therefore, CH3 domains dimerize in a hand-shaking manner, where only in a line down the center of the interface along the symmetry axis do residues on one side contact the same residue in the other partner. Residues on either side of that line contact the partner domain in opposite fashion: e.g., residues on the N-terminal side of the first domain make contact with residues in the second domain that are on the C-terminal side, and vice versa.

In one embodiment, a CH3-based SEED is designed to break the symmetry, making the two sides different. For example, strand exchange will make one side of the dimer more like IgA1, and the other side more like IgG. This approach creates two different CH3-based SEEDs that are approximately complementary in their use of IgG and IgA-derived amino acids. As shown in FIGS. 3A and 3B, the linear polypeptide sequence runs back and forth between IgG and IgA sequences in order to make one physical side of the dimeric structure IgA-like and the other side IgG-like. Thus, each final SEED sequence contains multiple exchange points, at each of which the linear sequence changes from IgA to IgG or from IgG to IgA (FIGS. 3A and 3B).

In general, there are many potential exchange points in the polypeptide sequence that can be chosen to alternate between IgA and IgG1 sequences. An important consideration is that the final structure should have good structural characteristics (e.g., stability, folding, expression, homology to the original). This can be achieved by inspection, simple modeling, extensive calculation, trial and error, selection, or by other means. In the specific embodiment described here, the sequence homology between the CH3 domains of IgA and IgG1 was used to decide the exchange points. Alignment of the crystal structures of the IgG1 and IgA CH3 revealed approximately parallel lines of amino acids along an approximate plane angled across the middle of the domain. The residues on the plane were identical in both CH3 classes in all but two strands in the IgG1/IgA structural alignment. Furthermore, the structure alignment generally showed the side chains of those amino acids in the same rotamer orientations, particularly in the hydrophobic core. It was therefore hypothesized that these residues could be used as exchange points, and the residues on one or the other side could be altered without disrupting the overall structure. FIGS. 3A and 3B show the sequence alignment with the exchange points highlighted in bold letters. FIGS. 5 and 6A-C show the molecular structure illustrating the 3-dimensional locations of the exchange points.

In the two cases where the residues are not the same at a junction region, the choices of exchange points were based on structural considerations. In one instance, Pro395 and Pro396 in IgG1 correspond structurally to Ala399 and Ser400 in IgA1. The division was made between these two residues. The other location is near the C-terminus, where Leu441 and Ser442 in IgG1 correspond structurally to Ile448 and Asp449 in IgA1. Again, the division was made between these residues.
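The two junction correspondences above agree with the paired residue ranges of Table 2. The sketch below makes that mapping explicit, assuming that residues within each paired range align one-to-one in order (the ranges in each row have equal lengths); the function name `igg_to_iga` is hypothetical.

```python
# Paired residue ranges from Table 2: (IgG1 start, IgG1 end, IgA1 start, IgA1 end).
# IgG1 uses Kabat EU numbering; IgA1 uses the sequential numbering of PDB entry 1OW0.
ALIGNED_RANGES = [
    (342, 358, 343, 359),
    (361, 387, 363, 389),
    (390, 400, 394, 404),
    (402, 443, 409, 450),
]


def igg_to_iga(igg_pos):
    """Map an IgG1 CH3 position to its aligned IgA1 position, or None if unaligned."""
    for g_start, g_end, a_start, _a_end in ALIGNED_RANGES:
        if g_start <= igg_pos <= g_end:
            return a_start + (igg_pos - g_start)
    return None


for pos in (395, 396, 441, 442):
    print(pos, "->", igg_to_iga(pos))
```

Positions that fall in the excluded turns, such as IgG1 position 360, map to `None` in this sketch.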

Protein-protein interactions are mediated by the complementarity of the two interacting surfaces. The dominant factor for the interaction is the composition and shape of those surfaces. Since the underlying backbone structures and hydrophobic interiors of the CH3 domains of IgA and IgG1 are similar, it was contemplated according to the principles of the invention that only the surface would have to be altered, while the rest of the domain could contain IgG sequences. In this case, the exchange points were designed on the strands that form the interface and were very close to one another, allowing only the residues critical for dimerization to be exchanged. Thus, as an alternative, it is possible that the rest of the structure could help stabilize the assembly domain, and so CH3 SEEDs with a single exchange point in each of the seven strands could have advantages.

Therefore, two types of SEEDs can be designed and designated as “Full” for the SEEDs in which most or all of the residues in the domain were involved in the strand exchange (corresponding to FIG. 1A) or “Surface” for the SEEDs in which the only altered residues are at the CH3 dimerization interface (corresponding to FIG. 1B).

Based on this Example, it will be appreciated by those skilled in the art that a variety of strategies can be used to generate SEEDs based on immunoglobulin superfamily constant domains.

Example 3: Designing the Sequences of the “Full” AG and GA SEEDs

As an example, the simplest way to make a “Full” SEED would be to use pure IgA sequence on the first side of the exchange point, and pure IgG1 sequence on the second side of the exchange point. If the exchange point is properly chosen, this would result in a SEED that should fold properly and would have an IgA1-like dimerizing surface on one side (e.g., approximately half) of the domain, and an IgG-like dimerizing surface on the other side. A ‘mirror image’ SEED can be made similarly, in which the first side is composed of IgG1 sequence and the second side is composed of IgA sequence. When these two SEEDs are expressed together, they will preferentially form heterodimers because only in the heterodimer will each surface be contacting a surface on the other domain that matches its class: that is, the first half of the first SEED, which is IgA1-like, will contact the second half of the second SEED, which is also IgA1-like, while the second half of the first SEED, which is IgG1-like, will contact the first half of the second SEED, which is also IgG1-like. Since both sides of the contact surface are highly complementary, the association should be strong. On the other hand, when either SEED attempts to form a homodimer, each half of the dimerization surface will contact a surface on the partner SEED that comes from a different class: that is, the first half of one SEED, which is IgA-like, will contact the second half of the partner domain, which is IgG-like; and the second half of the first SEED, which is IgG1-like, will contact the first half of the partner domain, which is IgA-like. Since these surfaces are not highly complementary, their affinity will be diminished, resulting in thermodynamics favoring the formation of fewer homodimers and more heterodimers.
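The pairing argument above can be restated as a toy model. The sketch below is an illustration only, not a structural or thermodynamic calculation: each domain is reduced to the class of its first and second halves, and the assumption (taken from the hand-shaking geometry described in Example 2) is that the first half of one domain faces the second half of its partner. All names are hypothetical.

```python
# Each domain is modelled as (class of first half, class of second half) of its
# dimerization surface: "A" = IgA-like, "G" = IgG-like.
FIRST_SEED = ("A", "G")
MIRROR_SEED = ("G", "A")
PARENT_IGG = ("G", "G")


def matched_halves(domain_1, domain_2):
    """Count the interface halves that face a half of the same class.

    In the dimer, the first half of one domain contacts the second half
    of the other domain, and vice versa.
    """
    contacts = [(domain_1[0], domain_2[1]), (domain_1[1], domain_2[0])]
    return sum(1 for left, right in contacts if left == right)


pairs = {
    "first SEED : mirror SEED": (FIRST_SEED, MIRROR_SEED),
    "first SEED : first SEED": (FIRST_SEED, FIRST_SEED),
    "mirror SEED : mirror SEED": (MIRROR_SEED, MIRROR_SEED),
    "parent IgG : parent IgG": (PARENT_IGG, PARENT_IGG),
}
for name, (d1, d2) in pairs.items():
    print(f"{name}: {matched_halves(d1, d2)} of 2 halves matched")
```

In this model only the heterodimer and the unmodified parent homodimer have both halves matched, which is the qualitative outcome the paragraph describes.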

In this example, the CH3 is the only part of the Fc or antibody that was altered. The rest of the Fc or immunoglobulin is from human IgG1. Altering the amino acid sequence where the CH3 contacts or interacts with CH2 could potentially create problems with the interface between the CH3 SEEDs and the IgG1 CH2 domains. In addition, this interface contains the binding site for FcRn, which confers important properties to the Fc that are desirable to retain. Therefore, structural information (Martin et al. (2001) Molec. Cell 7:867) was used to identify the CH3 residues involved in the interactions between CH3 and CH2, and between Fc and FcRn. Human IgG1 sequences were used for those residues in all SEEDs. Molecular modeling was also used to help choose the neighboring residues to avoid altering the structure of the FcRn interaction surface. The portion of CH3 that interacts with CH2 and with FcRn is not part of the dimerization interface; therefore, these alterations were unlikely to hinder the formation of heterodimers.

FIG. 3B has the “Full” SEED sequences aligned with the IgG1 and IgA sequences in structural alignment. Residues that reside at the exchange points are highlighted in bold. Residues that were unaltered due to their importance in maintaining the interaction with CH2 and/or with FcRn are underlined.

Example 4: Construction of Heterodimeric Fc and Antibody Molecules Containing CH3-Based SEEDs

The following general approach was used to make HuFc and HuFc-IL2 constructs, as well as antibody and antibody-IL2 constructs, containing CH3 SEED domains in place of IgG1 CH3 domains. The CH3 domain of IgG1 is almost entirely contained in an approximately 0.4 kb Ngo MIV/Sma I genomic DNA fragment, which is present in pdCs or pdHL expression plasmids that express the constant region of an IgG1 heavy chain. Exemplary expression plasmids are, for example, pdCs-huFc-IL2 (see, for example, Lo et al., Protein Engineering [1998] 11:495), or pdHL7-KS-IL2 (see, for example, U.S. Pat. No. 6,696,517). The Ngo MIV site lies within the intron sequence immediately 5′ of the exon encoding IgG1 CH3, and the Sma I site lies in a sequence encoding Ser₄₄₄Pro₄₄₅Gly₄₄₆ near the C-terminus of IgG1 (Kabat EU Index). An exemplary DNA sequence of a mature human IgG1 Fc expressed from a pdCs vector is shown in SEQ ID NO:27. Replacement of the parental Ngo MIV/Sma I fragment with a Ngo MIV/Sma I fragment encoding a CH3 SEED of the invention generates upon expression a polypeptide containing a constant region with a CH3 SEED.

DNA sequence in pdCs encoding mature human IgG1Fc including terminal Lysine residue SEQ ID NO: 27GAGCCCAAATCTTCTGACAAAACTCACACATGCCCACCGTGCCCAGGTAAGCCAGCCCAGGCCTCGCCCTCCAGCTCAAGGCGGGACAGGTGCCCTAGAGTAGCCTGCATCCAGGGACAGGCCCCAGCCGGGTGCTGACACGTCCACCTCCATCTCTTCCTCAGCACCTGAACTCCTGGGGGGACCGTCAGTCTTCCTCTTCCCCCCAAAACCCAAGGACACCCTCATGATCTCCCGGACCCCTGAGGTCACATGCGTGGTGGTGGACGTGAGCCACGAAGACCCTGAGGTCAAGTTCAACTGGTACGTGGACGGCGTGGAGGTGCATAATGCCAAGACAAAGCCGCGGGAGGAGCAGTACAACAGCACGTACCGTGTGGTCAGCGTCCTCACCGTCCTGCACCAGGACTGGCTGAATGGCAAGGAGTAGAAGTGCAAGGTCTCCAACAAAGCCCTCCCAGCCCCCATCGAGAAAACCATCTCCAAAGCCAAAGGTGGGACCCGTGGGGTGCGAGGGCCACATGGACAGAGGCCGGCTCGGCCCACCCTCTGCCCTGAGAGTGACCGCTGTACCAACCTCTGTCCCTACAGGGCAGCCCCGAGAACCACAGGTGTACACCCTGCCCCCATCACGGGAGGAGATGACCAAGAACCAGGTCAGCCTGACCTGCCTGGTCAAAGGCTTCTATCCCAGCGACATCGCCGTGGAGTGGGAGAGCAATGGGCAGCCGGAGAACAACTACAAGACCACGCCTCCCGTGCTGGACTCCGACGGCTCCTTCTTCCTCTATAGCAAGCTCACCGTGGACAAGAGCAGGTGGCAGCAGGGGAACGTCTTCTCATGCTCCGTGATGCATGAGGCTCTGCACAACCACTACACGCAGAAGAGCCTCTCCCTGT CCCCGGGTAAATGA
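The location of the Sma I site relative to the C-terminal codons can be checked on the 3′ end of SEQ ID NO:27. The sketch below takes the last 30 nucleotides of the sequence as printed (with the line-wrap space removed) and assumes they are in frame with the terminal stop codon; the recognition sequence CCCGGG for Sma I is standard, and the small codon table covers only the codons that occur in this stretch.

```python
# Last 30 nucleotides of SEQ ID NO:27 (line-wrap space removed).
TAIL = "AAGAGCCTCTCCCTGTCCCCGGGTAAATGA"
SMA_I = "CCCGGG"  # Sma I recognition sequence

# Only the codons present in TAIL; this is not a full genetic code table.
CODONS = {
    "AAG": "K", "AGC": "S", "CTC": "L", "TCC": "S", "CTG": "L",
    "CCG": "P", "GGT": "G", "AAA": "K", "TGA": "*",
}


def translate(dna):
    """Translate an in-frame DNA string codon by codon ('*' marks a stop)."""
    return "".join(CODONS[dna[i : i + 3]] for i in range(0, len(dna), 3))


site = TAIL.find(SMA_I)                      # 0-based start of the site
first_codon = site // 3                      # codon containing the first site base
last_codon = (site + len(SMA_I) - 1) // 3    # codon containing the last site base

protein = translate(TAIL)
print(protein)                                 # KSLSLSPGK*
print(protein[first_codon : last_codon + 1])   # SPG
```

Under these assumptions the site overlaps the Ser-Pro-Gly codons that immediately precede the terminal lysine, in agreement with the description in Example 4.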

Standard techniques were used to obtain DNA sequences encoding the CH3 SEEDs of the invention. For example, DNA molecules with the sequences shown in SEQ ID NO:28, SEQ ID NO:29, SEQ ID NO:30, SEQ ID NO:31, SEQ ID NO:32, and SEQ ID NO:53 were synthesized de novo and propagated in a pUC-derived carrier plasmid (Blue Heron Biotechnology, Bothell, Wash.).

DNA fragment Ngo MIV/SmaI, containing sequenceencoding AG(f0) SEED (underlined corresponding toFIG. 3B, ″Full AG SEED″): SEQ ID NO: 28gccggctcggcccaccctctgccctgagagtgaccgctgtaccaacctctgtccctacaGGGCAGCCCTTCCGGCCAGAGGTCCACCTGCTGCCCCCATCACGGGAGGAGATGACCAAGAACCAGGTCAGCCTGACCTGCCTGGCACGCGGCTTCTATCCCAAGGACATCGCCGTGGAGTGGGAGAGCAATGGGCAGCCGGAGAACAACTACAAGACCACGCCTTCCCGGCAGGAGCCCAGCCAGGGCACCACCACCTTCGCTGTGACCTCGAAGCTCACCGTGGACAAGAGCAGATGGCAGCAGGGGAACGTCTTCTCATGCTCCGTGATGCATGAGGCTCTGCACAACCACTACACGCAGAAGACCATCTCCCTGtccccgggDNA fragment Ngo MIW/Sma I, containing sequenceencoding AG(s0) SEED (underlined corresponding toFIG. 3A, ″Surface AG SEED″): SEQ ID NO: 29gccggctcggcccaccctctgccctgagagtgaccgctgtaccaacctctgtccctacaGGGCAGCCCTTCGAACCAGAGGTCCACACCCTGCCCCCATCACGGGAGGAGATGACCAAGAACCAGGTCAGCCTGACCTGCCTGGTCCGCGGCTTCTATCCCAGCGACATCGCCGTGGAGTGGGAGAGCAATGGGCAGCCGGAGAACAACTACAAGACCACGCCTTCCCGGCTGGAGCCCAGCCAGGGCACCACCACCTTCGCTGTGACCTCGAAGCTCACCGTGGACAAGAGCAGATGGCAGCAGGGGAACGTCTTCTCATGCTCCGTGATGCATGAGGCTCTGCACAACCACTACACGCAGAAGAGCCTCTCCCTGtccccgggDNA fragment Ngo MIW/Sma I, containing sequencecoding GA(f0) SEED (underlined corresponding toFIG. 3B, ″Full GA SEED″): SEQ ID NO: 30gccggctcggcccaccctctgccctgagagtgaccgctgtaccaacctctgtccctacaGGGCAGCCCCGAGAACCACAGGTGTACACCCTGCCCCCACCGTCGGAGGAGCTGGCCCTGAACGAGCTGGTGACGCTGACCTGCCTGGTCAAAGGCTTCTATCCCAGCGACATCGCCGTGGAGTGGCTGCAGGGGTCCCAGGAGCTGCCCCGCGAGAAGTACCTGACTTGGGCACCCGTGCTGGACTCCGACGGCTCCTTCTTCCTCTATAGTATACTGCGCGTGGCAGCCGAGGACTGGAAGAAGGGGGACACCTTCTCATGCTCCGTGATGCATGAGGCTCTGCACAACCACTACACGCAGAAGAGCCTCGACCGCtccccgggDNA fragment Ngo MIV/Sma I, containing sequenceencoding GA(s0) SEED (underlined corresponding toFIG. 
3A, ″Surface GA SEED″): SEQ ID NO: 31gccggctcggcccaccctctgccctgagagtgaccgctgtaccaacctctgtccctacaGGGCAGCCCCGAGAACCACAGGTGTACACCCTGCCCCCACCGTCGGAGGAGCTGGCCCTGAACAACCAGGTGACGCTGACCTGCCTGGTCAAAGGCTTCTATCCCAGCGACATCGCCGTGGAGTGGGAGAGCAATGGGCAGCCGGAGCCCCGCGAGAAGTACCTGACTTGGGCACCCGTGCTGGACTCCGACGGCTCCTTCTTCCTCTATTCGATACTGCGCGTGGACGCAAGCAGGTGGCAGCAGGGGAACGTCTTCTCATGCTCCGTGATGCATGAGGCTCTGCACAACCACTACACGCAGAAGAGCCTCTCCCTGtccccgggDNA fragment Ngo MIV/Sma I, containing sequence encoding GA(f1) SEED (underlined): SEQ ID NO: 32gccggctcggcccaccctctgccctgagagtgaccgctgtaccaacctctgtccctacaGGGCAGCCCCGAGAACCACAGGTGTACACCCTGCCCCCACCGTCGGAGGAGCTGGCCCTGAACGAGCaGGTGACGCTGACCTGCCTGGTCAAAGGCTTCTATCCCAGCGACATCGCCGTGGAGTGGCTGCAGGGGTCCCAGGAGCTGCCCCGCGAGAAGTACCTGACTTGGaCcCCCGTGgTGGACTCCGACGGCTCCTTCTTCCTCTATAGTATACTGCGCGTGaCAGCCGAtGACTGGAAGAAGGGGGACACCTTCTCATGCTCCGTGATGCATGAGGCTCTGCACAACCACTACACCTCAGAAGAGCCTCGACCGCtccccgggDNA fragment Ngo MIW/Sma I, containing sequenceencoding GA(f2) SEED (underlined): SEQ ID NO: 53gccggctcggcccaccctctgccctgagagtgaccgctgtaccaacctctgtccctacaGGGCAGCCCCGAGAACCACAGGTGTACACCCTGCCCCCACCGTCGGAGGAGCTGGCCCTGAACGAGCaGGTGACGCTGACCTGCCTGGTCAAAGGCTTCTATCCCAGCGACATCGCCGTGGAGTGGCTGCAGGGGTCCCAGGAGCTGCCCCGCGAGAAGTACCTGACTTGGgCaCCCGTGgacGACTCCGACGGCTCCcaCTTCCTCTATAGTATACTGCGCGTGaCAGCCGAtGACTGGAAGAAGGGGGACACCTTCTCATGCTCCGTGATGCATGAGGCTCTGCACAACCACTACACGCAGAAGAGCCTCGACCGCtccccgggIn addition, a polypeptide containing GA(f3) SEEDmay be encoded by the following DNA sequence:DNA fragment Ngo MIW/Sma I, containing sequenceencoding GA(f3) SEED (underlined): SEQ ID NO: 54gccggctcggcccaccctctgccctgagagtgaccgctgtaccaacctctgtccctacaGGGCAGCCCCGAGAACCACAGGTGTACACCCTGCCCCCACCGTCGGAGGAGCTGGCCCTGAACGAGCaGGTGACGCTGACCTGCCTGGTCAAAGGCTTCTATCCCAGCGACATCGCCGTGGAGTGGCTGCAGGGGTCCCAGGAGCTGCCCCGCGAGAAGTACCTGACTTGGaCcCCCGTGaccGACTCCGACGGCTCCgacTTCCTCTATAGTATACTGCGCGTGaCAGCCGAtGACTGGAAGAAGGGGGACACCTTCTCATGCTCCGTGATGCATGAGGCTCTGCACAACCACTACACGCAGAAGAGCCTCGACCGCtccccggg

These synthetic sequences were additionally extended at their 3′ end with an approximately 50 bp stretch of random DNA so as to allow easy separation of the excised Ngo MIV/Sma I insert fragment from a similarly sized plasmid vector fragment during fragment purification. The gel purified Ngo MIV/Sma I fragments were then ligated to a similarly treated pdCs vector containing either an Fc moiety or an Fc-IL2 moiety, or alternatively, to a similarly treated pdHL vector containing either a DI-KS or a DI-KS-IL2 moiety. Thus, for example, pdCs-HuFc(AG(f0))-IL2, containing the Ngo MIV/Sma I fragment for AG(f0) SEED (SEQ ID NO:28), and pdCs-HuFc(GA(f0)), containing the Ngo MIV/Sma I fragment for GA(f0) SEED (SEQ ID NO:30), were obtained. pdCs-HuFc(AG(f0))-IL2 and pdCs-HuFc(GA(f0)) encode an Fc(AG(f0) SEED)-IL-2 polypeptide chain and an Fc(GA(f0) SEED) polypeptide chain, respectively. Exemplary sequences of Fc(AG(f0) SEED)-IL-2 and of Fc(GA(f0) SEED) are shown as SEQ ID NO:33 and SEQ ID NO:34, respectively, below. A diagram of the resulting heterodimeric protein is shown in FIG. 11A. To obtain simultaneous expression of both polypeptide chains from a host cell, the transcription units for these Fc polypeptides were combined on a single expression vector as described below in Example 5.

Polypeptide sequence of a Fc(AG(f0) SEED)-IL2: SEQ ID NO: 33EPKSSDKTHTCPPCPAPELLGGPSVFLFPPKPKDTLMISRTPEVTCVVVDVSHEDPEVKFNWYVDGVEVHNAKTKPREEQYNSTYRVVSVLTVLHQDWLNGKEYKCKVSNKALPAPIEKTISKAKGQPFRPEVHLLPPSREEMTKNQVSLTCLARGFYPKDIAVEWESNGQPENNYKTTPSRQEPSQGTTTFAVTSKLTVDKSRWQQGNVFSCSVMHEALHNHYTQKTISLSPGKAPTSSSTKKTQLQLEHLLLDLQMILNGINNYKNPKLTRMLTFKFYMPKKATELKHLQCLEEELKPLEEVLNLAQSKNFHLRPRDLISNINVIVLELKGSETTFMCEYADETATIV EFLNRWITFCQSIISTLTPolypeptide sequence of a Fc(GA(f0) SEED): SEQ ID NO: 34EPKSSDKTHTCPPCPAPELLGGPSVFLFPPKPKDTLMISRTPEVTCVVVDVSHEDPEVKFNWYVDGVEVHNAKTKPREEQYNSTYRVVSVLTVLHQDWLNGKEYKCKVSNKALPAPIEKTISKAKGQPREPQVYTLPPPSEELALNELVTLTCLVKGFYPSDIAVEWLQGSQELPREKYLTWAPVLDSDGSFFLYSILRVAAEDWKKGDTFSCSVMHEALHNHYTQKSLDRSPGK

Similarly, pdHL-DI-KS(AG(f0))-IL2, containing the Ngo MIV/Sma I fragment for AG(f0) SEED (SEQ ID NO:28), and pdHL-DI-KS(GA(f0)), containing the Ngo MIV/Sma I fragment for GA(f0) SEED (SEQ ID NO:30), were obtained. pdHL-DI-KS(AG(f0))-IL2 and pdHL-DI-KS(GA(f0)) encode the DI-KS(AG(f0) SEED)-IL-2 heavy chain (SEQ ID NO:35) and the DI-KS(GA(f0) SEED) heavy chain (SEQ ID NO:36), respectively. Both expression vectors also encode the DI-KS light chain (SEQ ID NO:37).

Polypeptide sequence of DI-KS(AG(f0) SEED)-IL2 heavy chain:SEQ ID NO: 35 QIQLVQSGPELKKPGSSVKISCKASGYTFTNYGMNWVRQAPGKGLKWMGWINTYTGEPTYADDFKGRFTITAETSTSTLYLQLNNLRSEDTATYFCVRFISKGDYWGQGTTVTVSSASTKGPSVFPLAPSSKSTSGGTAALGCLVKDYFPEPVTVSWNSGALTSGVHTFPAVLQSSGLYSLSSVVTVPSSSLGTQTYICNVNHKPSNTKVDKRVEPKSCDKTHTCPPCPATELLGGPSVFLFPPKPKDTLMISRTPEVTCVVVDVSHEDPEVKFNWYVDGVEVHNAKTKPREEQYNSTYRVVSVLTVLHQDWLNGKEYKCKVSNKALPAPIEKTISKAKGQPFRPEVHLLPPSREEMTKNQVSLTCLARGFYPKDIAVEWESNGQPENNYTTPSRQEPSQGTTTFAVTSKLTVDKSRWQQGNVFSCSVMHEALHNHYTQKTISLSPGAAPTSSSTKKTQLQLEHLLLDLQMILNGINNYKNPKLTRMLTFKFYMPKKATELKHLQCLEEELKPLEEVLNLAQSKNFHLRPRDLISNINVIVLELKGSETTFMCEYADETATIVEFLNRWITFCQSIISTLTPolypeptide sequence of DI-KS(GA(f0) SEED) heavy chain: SEQ ID NO: 36QIQLVQSGPELKKPGSSVKISCKASGYTFTNYGMNWVRQAPGKGLKWMGWINTYTGEPTYADDFKGRFTITAETSTSTLYLQLNNLRSEDTATYFCVRFISKGDYWGQGTTVTVSSASTKGPSVFPLAPSSKSTSGGTAALGCLVKDYFPEPVTVSWNSGALTSGVHTFPAVLQSSGLYSLSSVVTVPSSSLGTQTYICNVNHKPSNTKVDKRVEPKSCDKTHTCPPCPAPELLGGPSVFLFPPKPKDTLMISRTPEVTCVVVDVSHEDPEVKFNWYVDGVEVHNAKTKPREEQYNSTYRVVSVLTVLHQDWLNGKEYKCKVSNKALPAPIEKTISKAKGQPREPQVYTLPPPSEELALNELVTLTCLVKGFYPSDIAVEWLQGSQELPREKYLTWAPVLDSDGSFFLYSILRVAAEDWKKGDTFSCSVMHEALHNHYTQK SLDRSPGKPolypeptide sequence of DI-KS light Chain: SEQ ID NO: 37QIVLTQSPASLAVSPGQRATITCSASSSVSYILWYQQKPGQPPKPWIFDTSNLASGFPSRFSGSGSGTSYLLTINSLEAEDAATYYCHQRSGYPYTFGGGTKVEIKRTVAAPSVFIFPPSDEQLKSGTASVVCLLNNFYPREAKVQWKVDNALQSGNSQESVTEQDSKDSTYSLSSTLTLSKADYEKHKVYACEVT HQGLSSPVTKSFNRGEC

To obtain a single expression vector expressing both DI-KS(AG(f0) SEED)-IL-2 and DI-KS(GA(f0) SEED) heavy chain transcription units as well as the common light chain transcription unit, a construct was prepared essentially as follows: an approximately 3.9 kb Sal I/Mfe I fragment containing the sequence encoding KS(AG(f0) SEED)-IL-2 was excised from the pdHL-10 expression construct (pdHL-10 is a later generation pdHL expression vector containing a single Sal I site outside of the transcription unit) and ligated into a Sal I/Bam HI digested pBS plasmid, together with a Bam HI/Mfe I duplex linker fragment. This duplex linker fragment is composed of Oligo11 (SEQ ID NO:38) and Oligo12 (SEQ ID NO:39) and contains an internal Sal I site.

Oligo11 (SEQ ID NO: 38)
AATTGCCGGGTCGACATACG

Oligo12 (SEQ ID NO: 39)
GATCCGTATGTCGACCCGGC

The 3.9 kb fragment was then excised from pBS as a Sal I fragment and inserted into the unique Sal I site of a pdHL-10 expression construct already containing the transcription units encoding DI-KS(GA(f0) SEED) heavy chain and the DI-KS light chain.
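The design of the duplex linker can be checked directly from the two oligonucleotide sequences. The sketch below assumes that each oligo carries a four-nucleotide 5′ overhang followed by a region that anneals to the other oligo; the Sal I recognition sequence GTCGAC is standard, and the function names are hypothetical.

```python
OLIGO11 = "AATTGCCGGGTCGACATACG"  # SEQ ID NO:38
OLIGO12 = "GATCCGTATGTCGACCCGGC"  # SEQ ID NO:39
SAL_I = "GTCGAC"                  # Sal I recognition sequence

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def split_linker(top, bottom, overhang_len=4):
    """Return both 5' overhangs and whether the remaining regions anneal."""
    paired_top = top[overhang_len:]
    paired_bottom = bottom[overhang_len:]
    anneals = paired_top == reverse_complement(paired_bottom)
    return top[:overhang_len], bottom[:overhang_len], anneals


top_overhang, bottom_overhang, anneals = split_linker(OLIGO11, OLIGO12)
print(top_overhang, bottom_overhang, anneals)  # AATT GATC True
print(SAL_I in OLIGO11)                        # True
```

The AATT and GATC overhangs are the cohesive ends left by Mfe I and Bam HI, respectively, which is consistent with the linker being described as a Bam HI/Mfe I fragment with an internal Sal I site.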

Example 5: Assay to Determine Heterodimeric Fc Molecules Containing CH3-Based SEEDs

The examples described here involve CH3 dimerization, which is an important step in nucleating the formation of Fc and immunoglobulin heavy chain dimers. In theory, if two distinct Fc moieties (e.g., termed A and B) containing CH3 domains are expressed simultaneously in a cell, they could pair and form Fc dimeric molecules in the following configurations: A:A, A:B, and B:B. If the CH3 domains and hinge domains are identical, the configurations A:A, A:B, and B:B are expected to occur in a 1:2:1 ratio if A and B are expressed in equal amounts. The kinetics and thermodynamics of the A-A, A-B, and B-B interactions, together with the relative expression levels, are important factors governing the observed ratio of these three final species. In general, when protein A and protein B are expressed in relative amounts [A] and [B], where [A]+[B]=1, and homodimers and heterodimers are produced in relative concentrations [A−A], [A−B], and [B−B], if there is unbiased association, these dimeric species will respectively be present in a ratio of [A]²:2*[A]*[B]:[B]². If the relative concentration [A−B]>2*[A]*[B], then heterodimerization is favored, while if the relative concentration [A−B]<2*[A]*[B], then homodimerization is favored. For a preferred SEED pair, the ratio [A−B]/(2*[A]*[B]) is greater than 2, and preferably greater than 3, and more preferably greater than 5.
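The unbiased-association expectation and the preference ratio defined above can be written as a short calculation. This is a minimal sketch; the function names are hypothetical and the observed heterodimer fraction used in the example is an arbitrary illustrative number, not a measured value.

```python
def unbiased_fractions(a, b):
    """Expected A:A, A:B, B:B fractions for random pairing.

    `a` and `b` are relative expression levels; they are normalized so
    that [A] + [B] = 1 before the ratio [A]^2 : 2[A][B] : [B]^2 is formed.
    """
    total = a + b
    a, b = a / total, b / total
    return a * a, 2 * a * b, b * b


def heterodimer_preference(observed_ab, a, b):
    """Ratio [A-B] / (2*[A]*[B]); values above 1 indicate favored heterodimerization."""
    _, expected_ab, _ = unbiased_fractions(a, b)
    return observed_ab / expected_ab


# Equal expression reproduces the 1:2:1 ratio stated in the text.
print(unbiased_fractions(1, 1))  # (0.25, 0.5, 0.25)

# Illustrative input only: a heterodimer fraction of 0.6 at a 1:3 expression ratio.
print(heterodimer_preference(0.6, 1, 3))
```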

To determine the ratios of the different species, one needs a way to distinguish them by an assay. An easy way to do this is to attach a fusion partner to one of the Fc subunits (e.g., “A”), which would result in each of the three final species having a significantly different molecular weight. Accordingly, constructs were prepared to express both human Fc (HuFc) and human Fc fused to human IL-2 (HuFc-IL-2) in one cell. The constructs were prepared as follows: The gene for HuFc was excised from a vector containing an Fc moiety (see, for example, Lo et al., Protein Engineering [1998] 11:495) by enzymatic restriction at a 5′ XbaI site and a 3′ XhoI site. The 1.4 kb fragment containing the HuFc gene was gel purified and subcloned into a second vector, pdCS-MuFc-KS-kappa, replacing its muFc with HuFc. The HuFc gene was flanked by two SalI sites outside the promoter region.

A third vector containing a gene coding for HuFc-IL-2 and a single SalI site was chosen to receive the HuFc gene. The vector was cut with SalI, treated with Calf Intestinal Phosphatase (CIP) and gel purified. The second vector was digested with SalI and a 2.5 kb fragment was gel purified. This fragment contained the HuFc gene and a promoter, and was inserted into the gel-purified third vector. The final resulting vector contained two different transcription units with duplicated versions of the same regulatory elements, one transcription unit controlling the expression of wild type HuFc and the other controlling the expression of wild type HuFc-IL-2. Expression constructs containing SEED-based HuFc and SEED-based HuFc-IL-2 were similarly made.

This final vector was expanded using Qiagen maxi-prep. 10 mg of the DNA was used to transiently transfect baby hamster kidney (BHK) cells, using the Lipofectamine™ 2000 kit (Invitrogen). Cells were split, with half grown in regular medium and the other half in serum-free medium, for two days. Supernatants (e.g., 100 µl) were harvested. 10 microliters of protein-A beads were added and mixed overnight at 4° C. to bind the protein. After washing 3× with PBS containing 1% Triton X-100, samples were loaded onto Nu-Page (Invitrogen) 4-12% gradient Bis-Tris gels, under both reducing and non-reducing conditions. Gels were stained with colloidal blue (Invitrogen) for direct protein visualization.

Typical control results are shown in lanes 8-10 in the gels shown in FIGS. 12D-E. The reducing gel in FIG. 12C shows the ratio of HuFc and HuFc-IL-2 subunits. The non-reducing gel in FIG. 12B shows that the HuFc and HuFc-IL2 molecules dimerize randomly, with no preference for heterodimerization as compared to homodimerization.

Gels were also transferred to nitrocellulose membranes for Western blot analysis. In the Western blots, protein was detected in two ways in order to measure both the Fc and the IL-2. Antibodies against human IgG Fc (Jackson Immunolabs) conjugated to horseradish peroxidase (HRP) were used to detect Fc. The blots were detected with ECL substrate and film exposure. A biotinylated antibody against human IL-2 (R&D systems) was used to detect IL-2, and the signal was developed by adding avidin conjugated to HRP, and detecting with ECL substrate and film exposure. These experiments confirmed the identity of bands shown in FIGS. 12D-E.

To measure the levels of heterodimers and homodimers formed during expression of “Full” GA SEED/AG SEED and “Surface” GA SEED/AG SEED proteins, similar experiments were performed. Single expression vector constructs expressing an AG SEED-IL2 fusion protein and a GA SEED protein were constructed as described above for the expression of Fc/Fc-IL2. As shown in lanes 2-4 in FIGS. 12B and 12C, when the “Full” GA SEED (Fc(GA(f0) SEED)) and the “Full” AG SEED-IL-2 (Fc(AG(f0) SEED)-IL2) proteins were co-expressed in NS/0 cells, heterodimerization was strongly preferred, with no detectable Fc(AG SEED)-IL2 homodimers, and only very small amounts of the Fc(GA SEED) homodimers were detected. Similarly, as shown in lanes 5-7 in FIGS. 12B and 12C, when the “Surface” GA SEED (Fc(GA(s0) SEED)) and the “Surface” AG SEED-IL-2 (Fc(AG(s0) SEED)-IL2) proteins were co-expressed in NS/0 cells, heterodimerization was strongly favored, with no detectable Fc(AG SEED)-IL-2 homodimers, and only small amounts of the Fc(GA SEED) homodimer detected. It was estimated that heterodimers constituted more than about 90% of the total amount of the proteins assembled in the cell.

Example 6: Construction, Expression, and Heterodimerization Properties of SEED Molecules with Reduced Immunogenicity

Because the AG and GA SEED protein sequences are hybrids between two different naturally occurring human sequences, these sequences include peptide segments that are not found in normal human proteins and that may be processed into non-self MHC Class II T cell epitopes. Therefore, the following sequences were designed to reduce the number of potential non-self T-cell epitopes in the AG SEED and GA SEED sequences, depicted by the polypeptide sequences shown in SEQ ID NO:1 and SEQ ID NO:2, respectively, wherein X₁, X₂, X₃, X₄, X₅, or X₆ may be any amino acid. In some embodiments, in SEQ ID NO:1, X₁ is S, X₂ is V or T, and X₃ is S. In some embodiments, in SEQ ID NO:2, X₁ is Q, X₂ is A or T, X₃ is L, V, D, or T, X₄ is F, A, D, E, G, H, K, N, P, Q, R, S, or T, X₅ is T, and X₆ is D.

Polypeptide sequence of AG SEED, with variant amino acids X₁-X₃:
SEQ ID NO: 1
GQPFRPEVHLLPPSREEMTKNQVSLTCLARGFYPX₁DIAVEWESNGQPENNYKTTPSRQEPSQGTTTFAVTSKLTX₂DKSRWQQGNVFSCSVMHEALHNHYTQKX₃ISL

Polypeptide sequence of GA SEED, with variant amino acids X₁-X₆:
SEQ ID NO: 2
GQPREPQVYTLPPPSEELALNEX₁VTLTCLVKGFYPSDIAVEWLQGSQELPREKYLTWX₂PVX₃DSDGSX₄FLYSILRVX₅AX₆DWKKGDTFSCSVMHEALHNHYTQKSLDR
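The preferred substitutions recited above define a small combinatorial set of variant sequences. The following Python sketch enumerates that set. The `{X1}`...`{X6}` placeholder notation and the helper name `enumerate_variants` are illustrative assumptions and are not part of the disclosure; the templates are transcribed from SEQ ID NO:1 and SEQ ID NO:2 as printed above.

```python
from itertools import product

# Templates transcribed from SEQ ID NO:1 and SEQ ID NO:2.
# {X1}..{X6} mark the variable positions (placeholder notation assumed here).
AG_TEMPLATE = (
    "GQPFRPEVHLLPPSREEMTKNQVSLTCLARGFYP{X1}DIAVEWESNGQPENNYKTTPSRQEPSQGTTTF"
    "AVTSKLT{X2}DKSRWQQGNVFSCSVMHEALHNHYTQK{X3}ISL"
)
GA_TEMPLATE = (
    "GQPREPQVYTLPPPSEELALNE{X1}VTLTCLVKGFYPSDIAVEWLQGSQELPREKYLTW{X2}PV{X3}DSDGS"
    "{X4}FLYSILRV{X5}A{X6}DWKKGDTFSCSVMHEALHNHYTQKSLDR"
)

# Residues named in the "some embodiments" sentences, one string per position.
AG_CHOICES = {"X1": "S", "X2": "VT", "X3": "S"}
GA_CHOICES = {"X1": "Q", "X2": "AT", "X3": "LVDT",
              "X4": "FADEGHKNPQRST", "X5": "T", "X6": "D"}


def enumerate_variants(template, choices):
    """Return every sequence obtained by filling each position with each allowed residue."""
    names = sorted(choices)
    return [
        template.format(**dict(zip(names, combo)))
        for combo in product(*(choices[name] for name in names))
    ]


ag_variants = enumerate_variants(AG_TEMPLATE, AG_CHOICES)
ga_variants = enumerate_variants(GA_TEMPLATE, GA_CHOICES)
print(len(ag_variants), "AG SEED variants;", len(ga_variants), "GA SEED variants")
```

Under these choices the AG SEED template yields 2 sequences (X₂ is V or T) and the GA SEED template yields 2 × 4 × 13 = 104 sequences.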

The DNA molecule (SEQ ID NO:32) encoding exemplary SEED variant GA(f1) SEED (SEQ ID NO:7) was made by de novo synthesis and was introduced into the pdCs expression vector as described in Example 4, producing the polypeptide Fc(GA(f1) SEED).

Polypeptide sequence of GA(f1) SEED:
SEQ ID NO: 7
GQPREPQVYTLPPPSEELALNEQVTLTCLVKGFYPSDIAVEWLQGSQELPREKYLTWTPVVDSDGSFFINSILRVTADDWKKGDTFSCSVMHEAHNHYTQKSLDR

Mutations were introduced into the exemplary variant SEED moieties, AG(f1) SEED (SEQ ID NO:4), AG(f2) SEED (SEQ ID NO:5), and GA(f2) SEED (SEQ ID NO:8), by a two-step PCR approach in which two mutagenized, partially overlapping PCR fragments from a first round of PCR amplification are combined in a second round of PCR amplification to generate the final full-length fragment, using standard methods familiar to those skilled in the art. Essentially, two PCR reactions were performed in the first round, each with a PCR primer incorporating the mutant sequence paired with an appropriate flanking primer containing suitable restriction sites, Ngo MIV for the upstream primer and Sma I for the downstream primer, and a DNA template encoding the appropriate parent SEED moiety. The same flanking PCR primers were used in the second PCR amplification reaction, using the products of the first PCR amplification as templates. The resultant fragment was cloned into a pCR2.1 vector (Invitrogen) and its sequence was verified. Finally, the 0.4 kb Ngo MIV/Sma I DNA fragment was excised from the vector, gel purified, and ligated into a similarly treated recipient expression plasmid, as described in Example 4.
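The two-step PCR approach described above can be illustrated with a simplified in-silico model. The sketch below is an idealization and not the protocol itself: primers are assumed to anneal at the template position with the fewest mismatches, products are assumed to carry the primer sequences at their ends, and the 66-nucleotide template, the primer values, and all function names are hypothetical choices made for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def best_site(template, primer):
    """Index where the primer aligns to the template with the fewest mismatches."""
    def mismatches(i):
        return sum(a != b for a, b in zip(template[i:i + len(primer)], primer))
    return min(range(len(template) - len(primer) + 1), key=mismatches)


def pcr(template, fwd, rev):
    """Idealized PCR: the product carries both primer sequences at its ends."""
    rev_top = revcomp(rev)  # reverse primer as it reads on the top strand
    start = best_site(template, fwd)
    end = best_site(template, rev_top)
    return fwd + template[start + len(fwd):end] + rev_top


def overlap_extension(template, flank_fwd, flank_rev, mut_fwd):
    """Two-round mutagenesis; mut_fwd is the top-strand mutagenic primer."""
    mut_rev = revcomp(mut_fwd)
    upstream = pcr(template, flank_fwd, mut_rev)    # round 1, first reaction
    downstream = pcr(template, mut_fwd, flank_rev)  # round 1, second reaction
    # Both fragments contain the mutant primer region, so they overlap there.
    fused = upstream + downstream[len(mut_fwd):]
    return pcr(fused, flank_fwd, flank_rev)         # round 2, flanking primers only


# Hypothetical toy template: begins with GCCGGC (Ngo MIV site), ends with CCCGGG (Sma I site).
TEMPLATE = ("GCCGGCTCGG" "TTCGCTGTGACCTCG" "AAGCTCACCGTGGACAAGAGC"
            "AGATGGCAGCAG" "TCCCCGGG")
FLANK_FWD = "GCCGGCTCGG"
FLANK_REV = revcomp("TCCCCGGG")
MUT_FWD = "AAGCTCACCACAGACAAGAGC"  # replaces the codon GTG with ACA

mutant = overlap_extension(TEMPLATE, FLANK_FWD, FLANK_REV, MUT_FWD)
print(mutant)
```

In this model the final product has the same length as the template and differs from it only at the codon targeted by the mutagenic primer pair, while both restriction sites are retained for subsequent cloning.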

Specifically, for AG(f1) SEED, primer pairs Oligo1 (SEQ ID NO:40)/Oligo2 (SEQ ID NO:41) and Oligo3 (SEQ ID NO:42)/Oligo4 (SEQ ID NO:43) with template pdCs-Fc(AG(f0) SEED)-IL2 were used in the first round of PCR reactions. Oligo1 (SEQ ID NO:40)/Oligo4 (SEQ ID NO:43) were used in the second round of PCR reactions, generating the DNA fragment shown in SEQ ID NO:44, which was introduced into pdCs-Fc(AG(f0) SEED)-IL2. For AG(f2) SEED, primer pairs Oligo1 (SEQ ID NO:40)/Oligo5 (SEQ ID NO:45) and Oligo6 (SEQ ID NO:46)/Oligo4 (SEQ ID NO:43) with template pCR2.1 containing the sequence shown in SEQ ID NO:44 were used in the first round of PCR reactions. Oligo1 (SEQ ID NO:40)/Oligo4 (SEQ ID NO:43) were used in the second round of PCR reactions, generating the DNA fragment shown in SEQ ID NO:47, which was introduced into pdCs-Fc(AG(f0) SEED)-IL2. For GA(f2) SEED, primer pairs Oligo1 (SEQ ID NO:40)/Oligo10 (SEQ ID NO:48) and Oligo7 (SEQ ID NO:49)/Oligo9 (SEQ ID NO:50) with template carrier plasmid pUC containing the sequence shown in SEQ ID NO:32 were used in the first round of PCR reactions. Oligo1 (SEQ ID NO:40)/Oligo9 (SEQ ID NO:50) were used in the second round of PCR reactions, generating the DNA fragment shown in SEQ ID NO:47 which was introduced into pdCs-Fc(GA(f2) SEED). All the sequences referred to above are shown below.

Oligo1 (SEQ ID NO: 40) GCCGGCTCGGCCCACCCTCT
Oligo2 (SEQ ID NO: 41) CGGCGATGTCGCTGGGATAGAA
Oligo3 (SEQ ID NO: 42) TTCTATCCCAGCGACATCGCCG
Oligo4 (SEQ ID NO: 43) CCCGGGGACAGGGAGATGGACTTCTGCGTGT
Oligo5 (SEQ ID NO: 45) GCTCTTGTCTGTGGTGAGCTT
Oligo6 (SEQ ID NO: 46) AAGCTCACCACAGACAAGAGC
Oligo7 (SEQ ID NO: 49) CCTGACTTGGGCACCCGTGGACGACTCCGACGGCTCCCACTTCCTCTATA
Oligo9 (SEQ ID NO: 50) CCCGGGGAGCGGTCGAGGCTC
Oligo10 (SEQ ID NO: 48) TATAGAGGAAGTGGGAGCCGTCGGAGTCGTCCACGGGTGCCCAAGTCAGG

DNA fragment Ngo MIV/Sma I, containing sequence encoding AG(f1) SEED (underlined):
SEQ ID NO: 44
gccggctcggcccaccctctgccctgagagtgaccgctgtaccaacctctgtccctacaGGGCAGCCCTTCCGGCCAGAGGTCCACCTGCTGCCCCCATCACGGGAGGAGATGACCAAGAACCAGGTCAGCCTGACCTGCCTGGCACGCGGCTTCTATCCCAgcGACATCGCCGTGGAGTGGGAGAGCAATGGGCAGCCGGAGAACAACTACAAGACCACGCCTTCCCGGCAGGAGCCCAGCCAGGGCACCACCACCTTCGCTGTGACCTCGAAGCTCACCGTGGACAAGAGCAGATGGCAGCAGGGGAACGTCTTCTCATGCTCCGTGATGCATGAGGCTCTGCACAACCACTACACGCAGAAGtCCATCTCCCTGtccccggg

DNA fragment Ngo MIV/Sma I, containing sequence encoding AG(f2) SEED (underlined):
SEQ ID NO: 47
gccggctcggcccaccctctgccctgagagtgaccgctgtaccaacctctgtccctacaGGGCAGCCCTTCCGGCCAGAGGTCCACCTGCTGCCCCCATCACGGGAGGAGATGACCAAGAACCAGGTCAGCCTGACCTGCCTGGCACGCGGCTTCTATCCCAgcGACATCGCCGTGGAGTGGGAGAGCAATGGGCAGCCGGAGAACAACTACAAGACCACGCCTTCCCGGCAGGAGCCCAGCCAGGGCACCACCACCTTCGCTGTGACCTCGAAGCTCACCacaGACAAGAGCAGATGGCAGCAGGGGAACGTCTTCTCATGCTCCGTGATGCATGAGGCTCTGCACAACCACTACACGCAGAAGtCCATCTCCCTGtccccggg
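The primer sequences listed above can be checked for internal consistency. The Python sketch below tests whether each pair of mutagenic primers is mutually reverse complementary, as expected for primers that generate overlapping fragments, and translates Oligo6 next to the corresponding segment of SEQ ID NO:44. The reading frame is an assumption, chosen because it reproduces the KLTX₂DKS segment of SEQ ID NO:1, and the helper names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate complete codons from position 0; a trailing partial codon is ignored."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


# Primer sequences as listed above.
OLIGO2 = "CGGCGATGTCGCTGGGATAGAA"
OLIGO3 = "TTCTATCCCAGCGACATCGCCG"
OLIGO5 = "GCTCTTGTCTGTGGTGAGCTT"
OLIGO6 = "AAGCTCACCACAGACAAGAGC"
OLIGO7 = "CCTGACTTGGGCACCCGTGGACGACTCCGACGGCTCCCACTTCCTCTATA"
OLIGO10 = "TATAGAGGAAGTGGGAGCCGTCGGAGTCGTCCACGGGTGCCCAAGTCAGG"

# Segment of SEQ ID NO:44 that Oligo6 anneals to (frame assumed, see above).
PARENT_SEGMENT = "AAGCTCACCGTGGACAAGAGC"

for name, bottom, top in [("Oligo2/Oligo3", OLIGO2, OLIGO3),
                          ("Oligo5/Oligo6", OLIGO5, OLIGO6),
                          ("Oligo10/Oligo7", OLIGO10, OLIGO7)]:
    print(name, "reverse complementary:", revcomp(bottom) == top)

print("parent segment:", translate(PARENT_SEGMENT))
print("Oligo6:        ", translate(OLIGO6))
```

Read in this frame, Oligo6 changes the codon GTG to ACA, which corresponds to the V or T alternatives recited for X₂ of SEQ ID NO:1 and to the lowercase "aca" in SEQ ID NO:47.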

Fc(AG(f1) SEED), Fc(AG(f2) SEED), Fc(GA(f1) SEED)-IL2 and Fc(GA(f2) SEED)-IL2 sequences were expressed individually and in combinations in HEK 293T cells, and the resulting secreted proteins were partially purified based on Fc binding to Staphylococcus protein A and characterized by SDS-PAGE. When the samples were run on a reducing SDS gel, it was apparent that the Fc(AG(f1) SEED) and Fc(AG(f2) SEED) proteins were expressed very poorly by themselves, which is similar to the parent Fc(AG(f0) SEED) protein. Without wishing to be bound by theory, the poor expression most likely results from the proteolysis of the monomeric protein that has no dimerization partner. The Fc(GA(f1) SEED)-IL2 protein was expressed at a high level, while the Fc(GA(f2) SEED)-IL2 protein, differing by the additional amino acid substitution Val75Thr, was expressed at a very low level. Again, without wishing to be bound by theory, the poor expression may result from the proteolysis of the monomeric protein that has no dimerization partner. The combinations Fc(AG(f1) SEED) plus Fc(GA(f1) SEED)-IL2, Fc(AG(f2) SEED) plus Fc(GA(f1) SEED)-IL2, Fc(AG(f1) SEED) plus Fc(GA(f2) SEED)-IL2, and Fc(AG(f2) SEED) plus Fc(GA(f2) SEED)-IL2 were tested, and all were expressed at high levels. The same samples were run on a non-reducing gel, which confirmed these results. This analysis indicated that, for the combinations, essentially all of the expressed protein was heterodimeric. These results indicate that certain variant GA and AG SEED proteins with reduced immunogenicity retain their preference for heterodimerization.

Example 7. Expression of an Antibody-Cytokine Fusion Protein Using SEED Fc Regions

To further demonstrate the versatility of the SEED-based Fc regions, an intact antibody with a single IL-2 moiety was constructed as described in Example 4. A diagram of this protein is shown in FIG. 11B. Specifically, the protein contained antibody V regions that bind to EpCAM and that have the sequences as described in U.S. Pat. No. 6,696,517, human IgG1 CH1 and CH2 domains, human Ckappa, the GA and AG SEED domains, and human IL-2 fused to the C-terminus of the AG SEED-containing heavy chain.

The protein was expressed in mammalian cells according to standard techniques, producing a protein with the polypeptide chains shown in SEQ ID NO:37, SEQ ID NO:36, and SEQ ID NO:35.

The resulting protein was characterized to determine the extent to which heterodimeric forms were secreted from the mammalian cells. For example, the secreted protein was characterized by non-reducing SDS-polyacrylamide gel electrophoresis. In principle, three bands might be identified, corresponding to antibodies with no, one or two IL-2 moieties. The actual non-reducing gel showed predominantly a single band with a molecular weight corresponding to an antibody with a single IL-2 moiety. A much less intense band with a molecular weight corresponding to no IL-2 moieties was seen, and a band with a molecular weight corresponding to two IL-2 moieties was not detectable. When the samples were reduced before running on the gel, approximately equal amounts of protein corresponding to antibody heavy chain and heavy chain-IL2 were detected.
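For comparison with the observed band pattern, the following Python sketch computes the distribution of the three possible species under a hypothetical null model in which the two heavy chains pair at random. The null model and the function name are assumptions introduced here for illustration; they are not measurements from this example.

```python
def random_pairing_fractions(il2_chain_fraction):
    """Fractions of dimers carrying 0, 1 or 2 IL-2 moieties if heavy chains paired at random.

    il2_chain_fraction is the fraction of heavy chains that carry IL-2.
    """
    p = il2_chain_fraction
    q = 1.0 - p
    return {0: q * q, 1: 2 * p * q, 2: p * p}


# Roughly equal amounts of heavy chain and heavy chain-IL2 correspond to p = 0.5.
baseline = random_pairing_fractions(0.5)
for n_il2, fraction in baseline.items():
    print(f"{n_il2} IL-2 moieties: {fraction:.0%}")
```

With equal amounts of the two chains, random pairing would give the three species in a 1:2:1 ratio, so a gel dominated by the single-IL-2 band with no detectable two-IL-2 band departs markedly from this baseline.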

The foregoing description of the present invention provides illustration and description, but is not intended to be exhaustive or to limit the invention to the precise form disclosed. Modifications and variations consistent with the above teachings may be acquired from practice of the invention. Thus, it is noted that the scope of the invention is defined by the claims and their equivalents.

INCORPORATION BY REFERENCE

All sequence and structure access numbers, publications and patent documents cited in this application are incorporated by reference in their entirety for all purposes to the same extent as if the contents of each individual publication or patent document were incorporated herein.

What is claimed is: 1-30. (canceled)
31. A method of designing a multidomain protein with domains that preferentially heterodimerize, the method comprising the steps of: (a) selecting a first polypeptide, a second polypeptide, a third polypeptide and a fourth polypeptide, wherein the first and third polypeptides dimerize with each other, but not with the second or fourth polypeptide, and wherein said second and fourth polypeptides dimerize with each other, (b) composing an amino acid sequence of a first domain from the first and the second polypeptides comprising at least one assembly element from the first polypeptide, and (c) composing an amino acid sequence of a second domain from the third and fourth polypeptides comprising at least one assembly element from the third polypeptide, such that the assembly elements from the first and third polypeptides assemble with each other, thereby promoting heterodimerization of the first and second domains.
32. The method of claim 31, wherein the first domain further comprises an assembly element from the second polypeptide and the second domain further comprises an assembly element from the fourth polypeptide such that the assembly elements from the second and fourth polypeptides assemble with each other, promoting heterodimerization of the first and second domains.
33. The method of claim 31, wherein step (b) or step (c) comprises comparing three-dimensional structures of two or more of the first, second, third or fourth polypeptides.
34. The method of claim 31, wherein the first and third polypeptides are identical.
35. The method of claim 34, wherein the second and fourth polypeptides are identical.